Abstract 

For any positive integer $n$, let $f(n)$ denote the number of solutions to the Diophantine equation

$$\frac{4}{n} = \frac{1}{x} + \frac{1}{y} + \frac{1}{z}$$

with $x, y, z$ positive integers. The Erdős-Straus conjecture asserts that $f(n) > 0$ for every $n \ge 2$. In this paper we obtain a number of upper and lower bounds for $f(n)$ or $f(p)$ for typical values of natural numbers $n$ and primes $p$. For instance, we establish that

$$N \log^2 N \ll \sum_{p \le N} f(p) \ll N \log^2 N \log\log N.$$

These upper and lower bounds show that a typical prime has a small number of solutions to the Erdős-Straus Diophantine equation; small, when compared with other additive problems, like Waring's problem.

1. Introduction

For any natural number $n \in \mathbf{N} = \{1, 2, \ldots\}$, let $f(n)$ denote the number of solutions $(x, y, z) \in \mathbf{N}^3$ to the Diophantine equation

$$\frac{4}{n} = \frac{1}{x} + \frac{1}{y} + \frac{1}{z} \tag{1.1}$$

(we do not assume $x, y, z$ to be distinct or in increasing order). Thus for instance

$$f(1) = 0,\ f(2) = 3,\ f(3) = 12,\ f(4) = 10,\ f(5) = 12,\ f(6) = 39,\ f(7) = 36,\ f(8) = 46,\ \ldots$$
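The small values of $f(n)$ listed above can be reproduced by direct enumeration. The following is a minimal sketch, not the method used for the figures in this paper: the helper names `solutions` and `f` are introduced here for illustration only, and the search simply enumerates sorted triples $x \le y \le z$ and then counts all orderings.

```python
from itertools import permutations

def solutions(n):
    """Set of ordered triples (x, y, z) of positive integers
    with 4/n = 1/x + 1/y + 1/z (brute force, illustrative only)."""
    found = set()
    # For a sorted triple x <= y <= z we need 4/(3n) <= 1/x < 4/n.
    for x in range(n // 4 + 1, (3 * n) // 4 + 1):
        p, q = 4 * x - n, n * x            # 4/n - 1/x = p/q > 0
        # We need 1/y < p/q and, because y <= z, 1/y >= p/(2q).
        for y in range(max(x, q // p + 1), (2 * q) // p + 1):
            num, den = q * y, p * y - q    # z = num/den when this is an integer
            if num % den == 0:
                found.update(permutations((x, y, num // den)))
    return found

def f(n):
    """Number of solutions of (1.1), counted with order."""
    return len(solutions(n))

print([f(n) for n in range(1, 9)])
```

The cost grows roughly quadratically in $n$, so this is only suitable for checking small cases.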

We plot the values of $f(n)$ for $n \le 1000$, and separately restricting to primes $p \le 1000$, in Figures 1 and 2.

Figure 1. The value $f(n)$ for all $n \le 1000$.

Figure 2. The value $f(p)$ for all primes $p \le 1000$.

From these graphs one might be tempted to draw conclusions, such as "$f(n) \ge n$ infinitely often", that we will refute in our investigations below.

The Erdős-Straus conjecture (see e.g. [21]) asserts that $f(n) > 0$ for all $n \ge 2$; it remains unresolved, although there are a number of partial results. The earliest references to this conjecture are papers by Erdős [15] and Obláth [47], and we draw attention to the fact that the latter paper was submitted in 1948.

Most subsequent approaches list parametric solutions, which solve the conjecture for $n$ lying in certain residue classes. These soluble classes are either used for analytic approaches via a sieve method, or for computational verifications. For instance, it was shown by Vaughan [77] that the number of $n < N$ for which $f(n) = 0$ is at most $N \exp(-c \log^{2/3} N)$ for some absolute constant $c > 0$ and all sufficiently large $N$. (Compare also [IHl EHl EHl [HI] for some weaker results.)

The conjecture was verified for all $n \le 10^{14}$ in [74]. In Table 1 we list a more complete history of these computations, but there may be further unpublished computations as well.

Most of these previous approaches concentrated on the question whether $f(n) > 0$ or not. In this paper we will instead study the average growth or extremal values of $f(n)$.

Since we clearly have $f(nm) \ge f(n)$ for any $n, m \in \mathbf{N}$, we see that to prove the Erdős-Straus conjecture it suffices to do so when $n$ is equal to a prime $p$.

In this paper we investigate the average behaviour of $f(p)$ for $p$ a prime. More precisely, we consider the asymptotic behaviour of the sum

$$\sum_{p \le N} f(p)$$

where $N$ is a large parameter, and $p$ ranges over all primes up to $N$. As we are only interested in asymptotics, we may ignore the case $p = 2$, and focus on the odd primes $p$.



Verified up to          Year      Reference
5000                    ≤ 1950    Straus, see [15]
8000                    1962      Bernstein [6]
20000                   ≤ 1969    Shapiro, see [32]
106128                  1948/9    Obláth [47]
141648                  1954      Rosati [56]
10^7                    1964      Yamamoto [83]
1.1 × 10^7              1976      Jollensten [32]
10^8                    1971      Terzi [76]
10^9                    1994      Elsholtz & Roth (unpublished)
10^10                   1995      Elsholtz & Roth (unpublished)
1.6 × 10^11             1996      Elsholtz & Roth (unpublished)
10^10                   1999      Kotsireas [33]
10^14                   1999      Swett [74]
2 × 10^14               2012      Bello-Hernández, Benito, Fernández [5]

Table 1. Numerical verifications of the Erdős-Straus conjecture. It appears that Terzi's set of soluble residue classes is correct, but that the set of checked primes in these classes is incomplete. Another reference to a calculation up to $10^8$ due to N. Franceschine III (1978) (see [21], [17] and frequently restated elsewhere) only mentions Terzi's calculation, but is not an independent verification. We are grateful to I. Kotsireas for confirming this.

Let us call a solution $(x, y, z)$ to (1.1) a Type I solution if $n$ divides $x$ but is coprime to $y, z$, and a Type II solution if $n$ divides $y, z$ but is coprime to $x$. Let $f_{\rm I}(n)$, $f_{\rm II}(n)$ denote the number of Type I and Type II solutions respectively. By permuting the $x, y, z$ we clearly have

$$f(n) \ge 3 f_{\rm I}(n) + 3 f_{\rm II}(n) \tag{1.2}$$

for all $n > 1$. Conversely, when $p$ is an odd prime, it is clear from considering the denominators in the Diophantine equation

$$\frac{4}{p} = \frac{1}{x} + \frac{1}{y} + \frac{1}{z} \tag{1.3}$$

that at least one of $x, y, z$ must be divisible by $p$; also, it is not possible for all three of $x, y, z$ to be divisible by $p$ as this forces the right-hand side of (1.3) to be at most $3/p$. We thus have

$$f(p) = 3 f_{\rm I}(p) + 3 f_{\rm II}(p) \tag{1.4}$$

for all odd primes $p$. Thus, to understand the asymptotics of $\sum_{p \le N} f(p)$, it suffices to understand the asymptotics of $\sum_{p \le N} f_{\rm I}(p)$ and $\sum_{p \le N} f_{\rm II}(p)$. As we shall see, Type II solutions are somewhat easier to understand than Type I solutions, but we will nevertheless be able to control both types of solutions in a reasonably satisfactory manner.
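The identity (1.4) can be checked numerically for small odd primes. The sketch below is illustrative only: `solutions` and `type_counts` are names introduced here, the enumeration is brute force, and Type I and Type II are classified literally according to the definitions above.

```python
from itertools import permutations
from math import gcd

def solutions(n):
    """Ordered triples (x, y, z) with 4/n = 1/x + 1/y + 1/z (brute force)."""
    found = set()
    for x in range(n // 4 + 1, (3 * n) // 4 + 1):
        p, q = 4 * x - n, n * x            # 4/n - 1/x = p/q
        for y in range(max(x, q // p + 1), (2 * q) // p + 1):
            num, den = q * y, p * y - q
            if num % den == 0:
                found.update(permutations((x, y, num // den)))
    return found

def type_counts(n):
    """Return (f(n), f_I(n), f_II(n)) by direct classification."""
    sols = solutions(n)
    f_I = sum(1 for x, y, z in sols
              if x % n == 0 and gcd(n, y) == 1 and gcd(n, z) == 1)
    f_II = sum(1 for x, y, z in sols
               if gcd(n, x) == 1 and y % n == 0 and z % n == 0)
    return len(sols), f_I, f_II

for p in (3, 5, 7, 11, 13):
    total, f_I, f_II = type_counts(p)
    print(p, total, f_I, f_II, total == 3 * f_I + 3 * f_II)
```

For composite $n$ only the inequality (1.2) is guaranteed, so the final column should not be expected to read `True` in general.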
We can now state our first main theorem. 

Theorem 1.1 (Average value of $f_{\rm I}$, $f_{\rm II}$). For all sufficiently large $N$, one has the bounds

$$N \log^3 N \ll \sum_{n \le N} f_{\rm I}(n) \ll N \log^3 N$$

$$N \log^3 N \ll \sum_{n \le N} f_{\rm II}(n) \ll N \log^3 N$$

$$N \log^2 N \ll \sum_{p \le N} f_{\rm I}(p) \ll N \log^2 N \log\log N$$

$$N \log^2 N \ll \sum_{p \le N} f_{\rm II}(p) \ll N \log^2 N.$$

Here, we use the usual asymptotic notation $X \ll Y$ or $X = O(Y)$ to denote the estimate $|X| \le CY$ for an absolute constant $C$, and use subscripts if we wish to allow dependencies on the implied constant $C$, thus for instance $X \ll_\varepsilon Y$ or $X = O_\varepsilon(Y)$ denotes the estimate $|X| \le C_\varepsilon Y$ for some $C_\varepsilon$ that can depend on $\varepsilon$. We remark that in a previous version of this manuscript, the weaker bound $\sum_{p \le N} f_{\rm II}(p) \ll N \log^2 N \log\log N$ was claimed. As pointed out subsequently by Jia [30], the argument in that previous version in fact only gave $\sum_{p \le N} f_{\rm II}(p) \ll N \log^2 N \log\log^2 N$, but can be repaired to give the originally claimed bound $\sum_{p \le N} f_{\rm II}(p) \ll N \log^2 N \log\log N$. These bounds are of course superseded by the results in Theorem 1.1.



As a corollary of this and (1.4), we see that

$$N \log^2 N \ll \sum_{p \le N} f(p) \ll N \log^2 N \log\log N.$$

From this, the prime number theorem, and Markov's inequality, we see that for any $\varepsilon > 0$, we can find a subset $A$ of primes of relative lower density at least $1 - \varepsilon$, thus

$$\liminf_{N \to \infty} \frac{|\{p \in A : p \le N\}|}{|\{p : p \le N\}|} \ge 1 - \varepsilon, \tag{1.5}$$

such that $f(p) = O_\varepsilon(\log^3 p \log\log p)$ for all $p \in A$. Informally, a typical prime has only $O(\log^3 p \log\log p)$ solutions to the Diophantine equation (1.3); or alternatively, for any function $\xi(p)$ of $p$ that goes to infinity as $p \to \infty$, one has $f(p) = O(\xi(p) \log^3 p \log\log p)$ for all $p$ in a subset of the primes of relative density 1. This may provide an explanation as to why analytic methods (such as the circle method) appear to be insufficient to resolve the Erdős-Straus conjecture, as such methods usually only give non-trivial lower bounds on the number of solutions to a Diophantine equation in the case when the number of such solutions grows polynomially with the height parameter $N$. (There are however some exceptions to this rule, such as Linnik's theorem [37] that every sufficiently large integer is the sum of two primes and a bounded number of powers of two, but such results tend to require a large number of summands in order to compensate for possible logarithmic losses in the analysis.)

The double logarithmic factor $\log\log N$ in the above arguments arises from technical limitations to our method (and specifically, in the inefficient nature of the Brun-Titchmarsh inequality (A.10) when applied to very short progressions), and we conjecture that it should be eliminated.

Remark 1.2. In view of these results, one can naively model $f(p)$ as a Poisson process with intensity at least $c \log^3 p$ for some absolute constant $c$. Using this probabilistic model as a heuristic, one expects any given prime to have a "probability" $1 - O(\exp(-c \log^3 p))$ of having at least one solution, which by the Borel-Cantelli lemma suggests that the Erdős-Straus conjecture is true for all but finitely many $p$. Of course, this is only a heuristic and does not constitute a rigorous argument. (However, one can view the results in [77], [13], based on the large sieve, as a rigorous analogue of this type of reasoning.)



Remark 1.3. From Theorem 1.1 we have the lower bound $\sum_{n \le N} f(n) \gg N \log^3 N$. In fact one has the stronger bound $\sum_{n \le N} f(n) \gg N \log^6 N$ (Heath-Brown, private communication) using the methods from [24]; see Remark 2.10 for further discussion. Thus, for composite $n$, most solutions are in fact neither of Type I nor of Type II. It would be of interest to get matching upper bounds for $\sum_{n \le N} f(n)$, but this seems to be beyond the scope of our methods. It would of course also be interesting to control higher moments such as $\sum_{p \le N} f_{\rm I}(p)^k$ or $\sum_{p \le N} f_{\rm II}(p)^k$, but this also seems to unfortunately lie out of reach of our methods, as the level of the relevant divisor sums becomes too great to handle.



To prove Theorem 1.1, we first use some solvability criteria for Type I and Type II solutions to obtain more tractable expressions for $f_{\rm I}(p)$ and $f_{\rm II}(p)$. As we shall see, $f_{\rm I}(p)$ is essentially (up to a factor of two) the number of quadruples $(a, c, d, f) \in \mathbf{N}^4$ with $4acd = p + f$, $f$ dividing $4a^2 d + 1$, and $acd \le 3p/4$, while $f_{\rm II}(p)$ is essentially the number of quadruples $(a, c, d, e) \in \mathbf{N}^4$ with $4acde = p + 4a^2 d + e$ and $acde \le 3p/2$. (We will systematically review the various known representations of Type I and Type II solutions in Section 2.) This, combined with standard tools from analytic number theory such as the Brun-Titchmarsh inequality and the Bombieri-Vinogradov inequality, already gives most of Theorem 1.1. The most difficult bounds are the upper bounds on $f_{\rm I}$, which eventually require an upper bound for expressions of the form

$$\sum_{a \le A} \sum_{b \le B} \tau(kab^2 + 1)$$

for various $A, B, k$, where $\tau(n) := \sum_{d \mid n} 1$ is the number of divisors of $n$, and $d \mid n$ denotes the assertion that $d$ divides $n$. By using an argument of Erdős [16], we obtain the following bound on this quantity:

Proposition 1.4 (Average value of $\tau(kab^2 + 1)$). For any $A, B > 1$, and any positive integer $k \ll (AB)^{O(1)}$, one has

$$\sum_{a \le A} \sum_{b \le B} \tau(kab^2 + 1) \ll AB \log(A + B) \log(1 + k).$$



Remark 1.5. Using the heuristic that $\tau(n) \sim \log n$ on the average (see (A.5)), one expects the true bound here to be $O(AB \log(A + B))$. The $\log(1 + k)$ loss can be reduced (for some ranges of $A, B, k$, at least) by using more tools (such as the Pólya-Vinogradov inequality), but this slightly inefficient bound will be sufficient for our applications.
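To get a feeling for the size of the double divisor sum in Proposition 1.4, one can tabulate it for small parameters and compare it with the heuristic main term $AB \log(A + B)$ from Remark 1.5. This is only a numerical sketch with hypothetical helper names (`tau`, `divisor_sum`); it uses trial division and says nothing about the implied constants in the proposition.

```python
from math import log

def tau(n):
    """Number of positive divisors of n, by trial division."""
    count, d = 0, 1
    while d * d <= n:
        if n % d == 0:
            count += 1 if d * d == n else 2
        d += 1
    return count

def divisor_sum(A, B, k):
    """Sum of tau(k*a*b^2 + 1) over 1 <= a <= A and 1 <= b <= B."""
    return sum(tau(k * a * b * b + 1)
               for a in range(1, A + 1)
               for b in range(1, B + 1))

A = B = 30
for k in (1, 4, 16):
    s = divisor_sum(A, B, k)
    # The ratio to A*B*log(A+B) is printed for comparison with Remark 1.5.
    print(k, s, round(s / (A * B * log(A + B)), 3))
```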



We prove Proposition 1.4 (as well as some variants of this estimate) in Section 7. Our main tool is a more quantitative version of a classical bound of Erdős [16] on the sum $\sum_{n \le N} \tau(P(n))$ for various polynomials $P$, which may be of independent interest; see Theorem 7.1.

We also collect a number of auxiliary results concerning the quantities $f_i(n)$, some of which were in previous literature. Firstly, we have a vanishing property at odd squares:

Proposition 1.6 (Vanishing). For any odd perfect square $n$, we have $f_{\rm I}(n) = f_{\rm II}(n) = 0$.

This observation essentially dates back to Schinzel (see [21], [42], [63]) and Yamamoto (see [83]) and is an easy application of quadratic reciprocity (A.7): for the convenience of the
reader, we give the proof in Section 4. A variant of this proposition was also established in [5].



Note that this does not disprove the Erdős-Straus conjecture, since the inequality (1.2) does not hold with equality on perfect squares; but it does indicate a key difficulty in attacking this conjecture, in that when showing that $f_{\rm I}(p)$ or $f_{\rm II}(p)$ is non-zero, one can only use methods that must necessarily fail when $p$ is replaced by an odd square such as $p^2$, which already rules out many strategies (e.g. a finite set of covering congruence strategies, or the circle method).
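Proposition 1.6 is easy to observe experimentally for the first few odd squares. The following brute-force sketch (helper names are introduced here for illustration) counts Type I and Type II solutions for $n = 9, 25, 49$; it is a sanity check on small cases, not a substitute for the quadratic reciprocity argument.

```python
from itertools import permutations
from math import gcd

def solutions(n):
    """Ordered triples (x, y, z) with 4/n = 1/x + 1/y + 1/z (brute force)."""
    found = set()
    for x in range(n // 4 + 1, (3 * n) // 4 + 1):
        p, q = 4 * x - n, n * x            # 4/n - 1/x = p/q
        for y in range(max(x, q // p + 1), (2 * q) // p + 1):
            num, den = q * y, p * y - q
            if num % den == 0:
                found.update(permutations((x, y, num // den)))
    return found

def f_I(n):
    """Type I count: n divides x, and y, z are coprime to n."""
    return sum(1 for x, y, z in solutions(n)
               if x % n == 0 and gcd(n, y) == 1 and gcd(n, z) == 1)

def f_II(n):
    """Type II count: n divides y and z, and x is coprime to n."""
    return sum(1 for x, y, z in solutions(n)
               if gcd(n, x) == 1 and y % n == 0 and z % n == 0)

for n in (9, 25, 49):
    # f(n) itself is positive here; only the two special types vanish.
    print(n, len(solutions(n)), f_I(n), f_II(n))
```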
Next, we establish some upper bounds on $f_{\rm I}(n)$, $f_{\rm II}(n)$ for fixed $n$:

Proposition 1.7 (Upper bounds). For any $n \in \mathbf{N}$, one has

$$f_{\rm I}(n) \ll n^{3/5 + O(1/\log\log n)}$$

and

$$f_{\rm II}(n) \ll n^{2/5 + O(1/\log\log n)}.$$

In particular, from this and (1.4) one can conclude that for any prime $p$ one has

$$f(p) \ll p^{3/5 + O(1/\log\log p)}.$$

This should be compared with the recent result in [8], which gives the bound $f(n) \ll_\varepsilon n^{2/3 + \varepsilon}$ for all $n$ and all $\varepsilon > 0$. For composite $n$ the treatment of parameters dividing $n$ appears to be more complicated and here we concentrate on those two cases that are motivated by the Erdős-Straus equation for prime denominator.

We prove this proposition in Section 3.

The main tools here are the multiple representations of Type I and Type II solutions available (see Section 2) and the divisor bound (A.6). The values of $f(p)$ appear to fluctuate in some respects like the values of the divisor function, whereas the average values of $f(p)$ behave much more regularly.

Moreover, in view of Theorem 1.1, one might also expect to have $f(n) \ll_\varepsilon n^\varepsilon$ for any $\varepsilon > 0$, but such logarithmic-type bounds on solutions to Diophantine equations seem difficult to obtain in general (Proposition 1.7 appears to be the limit of what one can obtain purely from the divisor bound (A.6) alone).



In the reverse direction, we have the following lower bounds on $f(n)$ for various sets of $n$:

Theorem 1.8 (Lower bounds). For infinitely many $n$, one has

$$f(n) \ge \exp\left( (\log 3 + o(1)) \frac{\log n}{\log\log n} \right),$$

where $o(1)$ denotes a quantity that goes to zero as $n \to \infty$.
For any function $\xi(n)$ going to $+\infty$ as $n \to \infty$, one has

$$f(n) \ge \exp\left( \frac{\log 3}{2} \log\log n - O\left(\xi(n) \sqrt{\log\log n}\right) \right) \gg (\log n)^{0.549}$$

for all $n$ in a subset $A$ of natural numbers of density 1 (thus $|A \cap \{1, \ldots, N\}|/N \to 1$ as $N \to \infty$).

Finally, one has

$$f(p) \ge \exp\left( \left( \frac{\log 3}{2} - o(1) \right) \log\log p \right) \gg (\log p)^{0.549}$$

for all primes $p$ in a subset $B$ of primes of relative density 1 (thus $|\{p \in B : p \le N\}|/|\{p : p \le N\}| \to 1$ as $N \to \infty$).

As the proof shows, the first two lower bounds are already valid for sums of two unit fractions. The results follow directly from the growth of certain divisor functions. An even better model for $f(n)$ is a suitable superposition of several divisor functions. The proof will
be in Section [HI 

Finally, we consider (following [42], [63]) the question of finding polynomial solutions to (1.1). Let us call a primitive residue class $n = r \bmod q$ solvable by polynomials if there exist polynomials $P_1(n), P_2(n), P_3(n)$ which take positive integer values for all sufficiently large $n$ in this residue class (so in particular, the coefficients of $P_1, P_2, P_3$ are rational), and such that

$$\frac{4}{n} = \frac{1}{P_1(n)} + \frac{1}{P_2(n)} + \frac{1}{P_3(n)}$$

for all $n$. Here we recall that a residue class $r \bmod q$ is primitive if $r$ is coprime to $q$. One could also consider non-primitive congruences, but these congruences only contain finitely many primes and are thus of less interest to solving the Erdős-Straus conjecture (and if the Erdős-Straus conjecture held for a common factor of $r$ and $q$, then the residue class $r \bmod q$ would trivially be solvable by polynomials).

By Dirichlet's theorem, the primitive residue class $r \bmod q$ contains arbitrarily large primes $p$. For each large prime $p$ in this class, we either have one or two of the $P_1(p), P_2(p), P_3(p)$ divisible by $p$, as observed previously. For $p$ large enough, note that $P_i(p)$ can only be divisible by $p$ if there is no constant term in $P_i$. We thus conclude that either one or two of the $P_i(n)$ have no constant term, but not all three. Let us call the congruence Type I solvable if one can take exactly one of $P_1, P_2, P_3$ to have no constant term, and Type II solvable if exactly two have no constant term. Thus every solvable primitive residue class $r \bmod q$ is either Type I or Type II solvable.

It is well-known (see [47], [42]) that any primitive residue class $n = r \bmod 840$ is solvable by polynomials unless $r$ is a perfect square. On the other hand, it is also known (see [12], [63]) that a primitive congruence class $n = r \bmod q$ which is a perfect square cannot be solved by polynomials (this also follows from Proposition 1.6). The next proposition classifies all solvable primitive congruences.

Proposition 1.9 (Solvable congruences). Let $r \bmod q$ be a primitive residue class. If this class is Type I solvable by polynomials, then all sufficiently large primes in this class belong to one of the following sets:

• $\{n = -f \bmod 4ad\}$, where $a, d, f \in \mathbf{N}$ are such that $f \mid 4a^2 d + 1$. |^6f

• $\{n = -f \bmod 4ac\} \cap \{n = -c/a \bmod f\}$, where $a, c, f \in \mathbf{N}$ are such that $(4ac, f) = 1$.

• $\{n = -f \bmod 4cd\} \cap \{n^2 = -4c^2 d \bmod f\}$, where $c, d, f \in \mathbf{N}$ are such that $(4cd, f) = 1$.

• $\{n = -1/e \bmod 4ab\}$, where $a, b, e \in \mathbf{N}$ are such that $e \mid a + b$ and $(e, 4ab) = 1$. |I]/, /56|/

Conversely, any residue class in one of the above four sets is solvable by polynomials.

Similarly, $r \bmod q$ is Type II solvable by polynomials if and only if it is a subset of one of the following residue classes:

• $-e \bmod 4ab$, where $a, b, e \in \mathbf{N}$ are such that $e \mid a + b$ and $(e, 4ab) = 1$. fljl

• $-4a^2 d \bmod f$, where $a, d, f \in \mathbf{N}$ are such that $4ad \mid f + 1$. [T^, JMi

• $-4a^2 d - e \bmod 4ade$, where $a, d, e \in \mathbf{N}$ are such that $(4ad, e) = 1$. \4(^

As indicated by the citations, many of these residue classes were observed to be solvable by polynomials in previous literature, but some of the conditions listed here appear to be new, and they form the complete list of all such classes. We prove Proposition 1.9 in Section 10.



Remark 1.10. The results in this paper would also extend (with minor changes) to the more general situation in which the numerator 4 in (1.3) is replaced by some other fixed positive integer, a situation considered first by Sierpiński and Schinzel (see e.g. f6^ [77| , \4S[ \4^ \2^).

We will not detail all of these extensions here but in Section 11 we extend our study of the average number of solutions to the more general question on sums of $k$ unit fractions

$$\frac{m}{n} = \frac{1}{t_1} + \frac{1}{t_2} + \cdots + \frac{1}{t_k}. \tag{1.6}$$

If $m > k \ge 3$, and the $t_i$ are positive integers, then it is an open problem if for each sufficiently large $n$ there is at least one solution. The Erdős-Straus conjecture with $m = 4$, $k = 3$, discussed above, is the most prominent case. If $m$ and $k$ are fixed, one can again establish sets of residue classes, such that (1.6) is generally soluble if $n$ is in any of these residue classes.



The problem of classifying solutions of (1.6) has been studied by Rav [55], Sós [71] and Elsholtz [13]. Moreover Viola [78], Shen [67] and Elsholtz [14] have used a suitable subset of these solutions to give (for fixed $m > k \ge 3$) quantitative bounds on the number of those integers $n \le N$ for which (1.6) does not have any solution.

We will focus on the case of Type II solutions, in which $t_2, \ldots, t_k$ are divisible by $n$. The classification of solutions that we give below also works for other divisibility patterns, but Type II solutions are the easiest to count, and so we shall restrict our attention to this case. Strictly speaking, the definition of a Type II solution here is slightly different from that discussed previously, because we do not require that $t_1$ is coprime to $n$. However, this coprimality is automatic when $n$ is prime (otherwise the right-hand side of (1.6) would only be at most $k/n$). For composite $n$, it is possible to insert this condition and still obtain the lower bound (1.7), but this would complicate the argument slightly and we have chosen not to do so here.

For given $m, k, n$, let $f_{m,k,{\rm II}}(n)$ denote the number of Type II solutions. Our main result regarding this quantity is the following lower bound:

Theorem 1.11. Let $m > k \ge 3$ be fixed. Then, for $N$ sufficiently large, one has

$$\sum_{n \le N} f_{m,k,{\rm II}}(n) \gg_{m,k} N (\log N)^{2^{k-1} - 1} \tag{1.7}$$

and

$$\sum_{p \le N} f_{m,k,{\rm II}}(p) \gg_{m,k} \frac{N (\log N)^{2^{k-1} - 2}}{\log\log N}.$$

Our emphasis here is on the exponential growth of the exponent. In particular, as $k$ increases by one, the average number of solutions is roughly squared. The denominator of $\log\log N$ is present for technical reasons (due to use of the crude lower bound (A.11) on the Euler totient function), and it is likely that it could be eliminated (much as it is in the $m = 4$, $k = 3$ case) with additional effort.



Remark 1.12. If we let $f_{m,k}(n)$ be the total number of solutions to (1.6) (not just Type II solutions), then we of course obtain as a corollary that

$$\sum_{n \le N} f_{m,k}(n) \gg_{m,k} N (\log N)^{2^{k-1} - 1}.$$

We do not expect the power of the logarithm to be sharp in this case (cf. Remark 2.10). For instance, in [27] it is shown that



for any fixed m. 



^^-=<"' = (*R^'"") "'"'"" 



Note that the equation (1.6) can be rewritten as

$$\frac{1}{m t_1} + \cdots + \frac{1}{m t_k} + \frac{1}{-n} = 0,$$

which is primitive when $n$ is prime. As a consequence, we obtain a lower bound for the number of integer points on the (generalised) Cayley surface:

Corollary 1.13. Let $k \ge 3$. The number of integer points of the following generalization of Cayley's cubic surface,

$$0 = \sum_{i=0}^{k} \frac{1}{t_i},$$

with $t_i$ non-zero integers with $\min_i |t_i| \le N$, is at least $c_k N (\log N)^{2^{k-1} - 2}/\log\log N$ for some $c_k > 0$ depending only on $k$.

Again, the double logarithmic factor should be removable with some additional effort, although the exponent $2^{k-1} - 2$ is not expected to be sharp, and should be improvable also.

Part of the first author's work on this project was supported by the German National Merit Foundation. The second author is supported by a grant from the MacArthur Foundation, by NSF grant DMS-0649473, and by the NSF Waterman award. The authors thank Nicolas Templier for many helpful comments and references, and the referee and editor for many useful corrections and suggestions. The first author is very grateful to Roger Heath-Brown for very generous advice on the subject (dating back as far as 1994). Both authors are particularly indebted to him for several remarks (including Remark 2.10), and also for contributing some of the key arguments here (such as the lower bound on $\sum_{n \le N} f_{\rm II}(n)$ and $\sum_{p \le N} f_{\rm II}(p)$) which have been reproduced here with permission. The first author also wishes to thank Tim Browning, Ernie Croot and Arnd Roth for discussions on the subject.

2. Representation of Type I and Type II solutions 

We now discuss the representation of Type I and Type II solutions. There are many such 
representations in the literature (see e.g. [1], [S], [B|, |3B], [55], [56], |[77j, 1^801); we will remark 
how each of these representations can be viewed as a form of the one given here after describing 
a certain algebraic variety in coordinates. 

For any non-zero complex number $n$, consider the algebraic surface

$$S_n := \{(x, y, z) \in \mathbf{C}^3 : 4xyz = nyz + nxz + nxy\} \subset \mathbf{C}^3.$$

Of course, when $n$ is a natural number, $f(n)$ is nothing more than the number of $\mathbf{N}$-points $(x, y, z) \in S_n \cap \mathbf{N}^3$ on this surface.




It is somewhat inconvenient to count $\mathbf{N}$-points on $S_n$ directly, due to the fact that $x, y, z$ are likely to share many common factors. To eliminate these common factors, it is convenient to lift $S_n$ to higher-dimensional varieties $\Sigma^{\rm I}_n$, $\Sigma^{\rm II}_n$ (and more specifically, to three-dimensional varieties in $\mathbf{C}^6$), which are adapted to parameterising Type I and Type II solutions respectively. This will replace the three original coordinates $x, y, z$ by six coordinates $a, b, c, d, e, f$, any three of which can be used to parameterise $\Sigma^{\rm I}_n$ or $\Sigma^{\rm II}_n$. This multiplicity of parameterisations will be useful for many of the applications in this paper; rather than pick one parameterisation in advance, it is convenient to be able to pick and choose between them, depending on the situation.

We begin with the description of Type I solutions. More precisely, we define $\Sigma^{\rm I}_n$ to be the set of all sextuples $(a, b, c, d, e, f) \in \mathbf{C}^6$ which are non-zero and obey the constraints

$$4abd = ne + 1 \tag{2.1}$$
$$ce = a + b \tag{2.2}$$
$$4abcd = na + nb + c \tag{2.3}$$
$$4acde = ne + 4a^2 d + 1 \tag{2.4}$$
$$4bcde = ne + 4b^2 d + 1 \tag{2.5}$$
$$4acd = n + f \tag{2.6}$$
$$ef = 4a^2 d + 1 \tag{2.7}$$
$$bf = na + c \tag{2.8}$$
$$n^2 + 4c^2 d = f(4bcd - n). \tag{2.9}$$

Remark 2.1. There are multiple redundancies in these constraints; to take just one example, (2.9) follows from (2.3) and (2.6). One could in fact specify $\Sigma^{\rm I}_n$ using just three of these nine constraints if desired. However, this redundancy will be useful in the sequel, as we will be taking full advantage of all nine of these identities.
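The redundancy among (2.1)-(2.9) can be confirmed with exact rational arithmetic: choose $a, c, d$ freely, solve (2.6), (2.8), (2.7) for $f, b, e$, and check that all nine identities then hold. The sketch below is illustrative; the function names are ours, and the sample points are rational points of the variety, not necessarily $\mathbf{N}$-points.

```python
from fractions import Fraction as F

def sigma_I_point(n, a, c, d):
    """Solve (2.6), (2.8), (2.7) for f, b, e given integers n, a, c, d."""
    f = 4 * a * c * d - n
    if f == 0:
        raise ValueError("need 4acd != n")
    b = F(n * a + c, f)
    e = F(4 * a * a * d + 1, f)
    return a, b, c, d, e, F(f)

def constraints_I(n, a, b, c, d, e, f):
    """The identities (2.1)-(2.9) as (lhs, rhs) pairs."""
    return [
        (4 * a * b * d, n * e + 1),                          # (2.1)
        (c * e, a + b),                                      # (2.2)
        (4 * a * b * c * d, n * a + n * b + c),              # (2.3)
        (4 * a * c * d * e, n * e + 4 * a * a * d + 1),      # (2.4)
        (4 * b * c * d * e, n * e + 4 * b * b * d + 1),      # (2.5)
        (4 * a * c * d, n + f),                              # (2.6)
        (e * f, 4 * a * a * d + 1),                          # (2.7)
        (b * f, n * a + c),                                  # (2.8)
        (n * n + 4 * c * c * d, f * (4 * b * c * d - n)),    # (2.9)
    ]

point = sigma_I_point(5, 1, 2, 3)
print(point)
print(all(lhs == rhs for lhs, rhs in constraints_I(5, *point)))
```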



The identities (2.1)-(2.9) form an algebraic set that can be parameterised (perhaps up to some bounded multiplicity) by fixing three of the six coordinates $a, b, c, d, e, f$ and solving for the other three coordinates. For instance, using the coordinates $a, c, d$, one easily verifies that

$$\Sigma^{\rm I}_n = \left\{ \left( a, \frac{na + c}{4acd - n}, c, d, \frac{4a^2 d + 1}{4acd - n}, 4acd - n \right) : a, c, d \in \mathbf{C}^\times;\ 4acd \ne n \right\}$$

and similarly for the other $\binom{6}{3} - 1 = 19$ choices of three coordinates; we omit the elementary but tedious computations. Thus we see that $\Sigma^{\rm I}_n$ is a three-dimensional algebraic variety. From (2.3) we see that the map

$$\pi^{\rm I}_n : (a, b, c, d, e, f) \mapsto (abdn, acd, bcd)$$

maps $\Sigma^{\rm I}_n$ to $S_n$. After quotienting out by the dilation symmetry

$$(a, b, c, d, e, f) \mapsto (\lambda a, \lambda b, \lambda c, \lambda^{-2} d, e, f) \tag{2.10}$$

of $\Sigma^{\rm I}_n$, this map is injective.

If $n$ is a natural number, then $\pi^{\rm I}_n$ clearly maps $\mathbf{N}$-points of $\Sigma^{\rm I}_n$ to $\mathbf{N}$-points of $S_n$, and if $c$ is coprime to $n$, gives a Type I solution (note that $abd$ is automatically coprime to $n$, thanks to (2.1)). In the converse direction, all Type I solutions arise in this manner:




Proposition 2.2 (Description of Type I solutions). Let $n \in \mathbf{N}$, and let $(x, y, z)$ be a Type I solution. Then there exists a unique $(a, b, c, d, e, f) \in \mathbf{N}^6 \cap \Sigma^{\rm I}_n$ with $abcd$ coprime to $n$ and $a, b, c$ having no common factor, such that $\pi^{\rm I}_n(a, b, c, d, e, f) = (x, y, z)$.

Proof. The uniqueness follows since $\pi^{\rm I}_n$ is injective after quotienting out by dilations. To show existence, we factor $x = ndx'$, $y = dy'$, $z = dz'$, where $x', y', z'$ are coprime, then after multiplying (1.1) by $ndx'y'z'$ we have

$$4dx'y'z' = y'z' + nx'y' + nx'z'. \tag{2.11}$$

As $y', z'$ are coprime to $n$, we conclude that $x'$ divides $y'z'$, $y'$ divides $x'z'$, and $z'$ divides $x'y'$. Splitting into prime factors, we conclude that

$$x' = ab, \quad y' = ac, \quad z' = bc \tag{2.12}$$

for some natural numbers $a, b, c$; since $x', y', z'$ have no common factor, $a, b, c$ have no common factor also. As $y, z$ were coprime to $n$, $abcd$ is coprime to $n$ also.

Substituting (2.12) into (2.11) we obtain (2.3), which in particular implies (as $c$ is coprime to $n$) that $c$ divides $a + b$. If we then set $e := (a + b)/c$ and $f := 4acd - n = (na + c)/b$, then $e, f$ are natural numbers, and we obtain the other identities (2.1)-(2.9) by routine algebra. By construction we have $\pi^{\rm I}_n(a, b, c, d, e, f) = (x, y, z)$, and the claim follows. □
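The existence part of the proof is constructive and can be turned into a short routine. The sketch below follows the factorisation $x = ndx'$, $y = dy'$, $z = dz'$ and reads off $a, b, c$ as pairwise greatest common divisors, which is valid because the proof shows $x' = ab$, $y' = ac$, $z' = bc$ with $a, b, c$ pairwise coprime. The function name `type_I_sextuple` is hypothetical.

```python
from math import gcd

def type_I_sextuple(n, x, y, z):
    """Coordinates (a, b, c, d, e, f) of a Type I solution (x, y, z) of (1.1)."""
    if 4 * x * y * z != n * (y * z + x * z + x * y):
        raise ValueError("(x, y, z) does not solve 4/n = 1/x + 1/y + 1/z")
    if x % n != 0 or gcd(n, y) != 1 or gcd(n, z) != 1:
        raise ValueError("not a Type I solution")
    d = gcd(gcd(x // n, y), z)              # common factor of x/n, y, z
    xp, yp, zp = x // (n * d), y // d, z // d
    a, b, c = gcd(xp, yp), gcd(xp, zp), gcd(yp, zp)
    assert (xp, yp, zp) == (a * b, a * c, b * c)          # (2.12)
    e, f = (a + b) // c, 4 * a * c * d - n
    assert c * e == a + b and e * f == 4 * a * a * d + 1  # (2.2), (2.7)
    return a, b, c, d, e, f

# 4/5 = 1/20 + 1/2 + 1/4 is a Type I solution for n = 5.
a, b, c, d, e, f = type_I_sextuple(5, 20, 2, 4)
print((a, b, c, d, e, f), (a * b * d * 5, a * c * d, b * c * d))
```

The second tuple printed is the image under $\pi^{\rm I}_n$, which recovers the original triple.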

In particular, for fixed $n$, a Type I solution exists if and only if there is an $\mathbf{N}$-point $(a, b, c, d, e, f)$ of $\Sigma^{\rm I}_n$ with $abcd$ coprime to $n$ (the requirement that $a, b, c$ have no common factor can be removed using the symmetry (2.10)). By parameterising $\Sigma^{\rm I}_n$ using three or four of the six coordinates, we recover some of the known characterisations of Type I solvability:

Proposition 2.3. Let $n$ be a natural number. Then the following are equivalent:

• There exists a Type I solution $(x, y, z)$.

• There exist $a, b, e \in \mathbf{N}$ with $e \mid a + b$ and $4ab \mid ne + 1$. fl^

• There exist $a, b, c, d \in \mathbf{N}$ such that $4abcd = na + nb + c$ with $c$ coprime to $n$. |^

• There exist $a, c, d, e \in \mathbf{N}$ such that $ne + 1 = 4ad(ce - a)$ with $c$ coprime to $n$. ]5(^, 4^^

• There exist $a, c, d, f \in \mathbf{N}$ such that $n = 4acd - f$ and $f \mid 4a^2 d + 1$, with $c$ coprime to $n$.

m

• There exist $b, c, d, e$ with $ne = (4bcde - 1) - 4b^2 d$ and $c$ coprime to $n$. J^

The proof of this proposition is routine and is omitted.

Remark 2.4. Type I solutions $(x, y, z)$ have the obvious reflection symmetry $(x, y, z) \mapsto (x, z, y)$. With (2.6) and (2.9) the corresponding symmetry for $\Sigma^{\rm I}_n$ is given by

$$(a, b, c, d, e, f) \mapsto \left( b, a, c, d, e, \frac{n^2 + 4c^2 d}{f} \right).$$

We will typically only use the $\Sigma^{\rm I}_n$ parameterisation when $y \le z$ (or equivalently when $a \le b$), in order to keep the sizes of various parameters small.

Remark 2.5. If we consider $\mathbf{N}$-points $(a, b, c, d, e, f)$ of $\Sigma^{\rm I}_n$ with $a = 1$, they can be explicitly parameterised as

$$\left( 1, ce - 1, c, \frac{ef - 1}{4}, e, f \right)$$

where $e, f$ are natural numbers with $ef = 1 \bmod 4$ and $n = cef - c - f$. This shows that any $n$ of the form $cef - c - f$ with $ef = 1 \bmod 4$ solves the Erdős-Straus conjecture, an observation made in [5]. However, this is a relatively small set of solutions (corresponding to roughly $\log^2 n$ solutions for a given $n$ on average, rather than $\log^3 n$), due to the restriction $a = 1$. Nevertheless, in [5] it was verified that all primes $p = 1 \bmod 4$ with $p \le 10^{14}$ were representable in this form.
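The parameterisation in Remark 2.5 gives an explicit decomposition of $4/n$ whenever $n = cef - c - f$ with $ef = 1 \bmod 4$. The following sketch (with a hypothetical function name, and with the extra requirement $ce \ge 2$ so that $b = ce - 1$ is positive) builds the triple $(abdn, acd, bcd)$ with $a = 1$ and checks it with exact fractions.

```python
from fractions import Fraction

def decomposition_from_cef(c, e, f):
    """Return (n, (x, y, z)) with n = cef - c - f and 4/n = 1/x + 1/y + 1/z,
    using the a = 1 points of Remark 2.5."""
    if (e * f) % 4 != 1:
        raise ValueError("need ef = 1 mod 4")
    if c * e < 2:
        raise ValueError("need ce >= 2 so that b = ce - 1 is positive")
    n = c * e * f - c - f
    if n < 1:
        raise ValueError("n = cef - c - f must be positive")
    a, b, d = 1, c * e - 1, (e * f - 1) // 4
    return n, (a * b * d * n, a * c * d, b * c * d)

for c, e, f in [(1, 3, 3), (2, 1, 5), (2, 1, 9)]:
    n, (x, y, z) = decomposition_from_cef(c, e, f)
    ok = Fraction(1, x) + Fraction(1, y) + Fraction(1, z) == Fraction(4, n)
    print(n, (x, y, z), ok)
```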



Now we turn to Type II solutions. Here, we replace $\Sigma^{\rm I}_n$ by the variety $\Sigma^{\rm II}_n$, defined as the set of all sextuples $(a, b, c, d, e, f) \in \mathbf{C}^6$ which are non-zero and obey the constraints

$$4abd = n + e \tag{2.13}$$
$$ce = a + b \tag{2.14}$$
$$4abcd = a + b + nc \tag{2.15}$$
$$4acde = n + 4a^2 d + e \tag{2.16}$$
$$4bcde = n + 4b^2 d + e \tag{2.17}$$
$$4acd = f + 1 \tag{2.18}$$
$$ef = n + 4a^2 d \tag{2.19}$$
$$bf = nc + a \tag{2.20}$$
$$4c^2 dn + 1 = f(4bcd - 1). \tag{2.21}$$

This is a very similar variety to $\Sigma^{\rm I}_n$; indeed the non-isotropic dilation

$$(a, b, c, d, e, f) \mapsto (a, b, c/n^2, dn, n^2 e, f/n)$$

is a bijection from $\Sigma^{\rm I}_n$ to $\Sigma^{\rm II}_n$. Thus, as with $\Sigma^{\rm I}_n$, $\Sigma^{\rm II}_n$ is a three-dimensional algebraic variety in $\mathbf{C}^6$ which can be parameterised by any three of the six coordinates in $(a, b, c, d, e, f)$. As before, many of the constraints can be viewed as redundant; for instance, (2.21) is a consequence of (2.15) and (2.18). Note that $\Sigma^{\rm II}_n$ enjoys the same dilation symmetry (2.10) as $\Sigma^{\rm I}_n$, and also has the reflection symmetry (using (2.18) and (2.21))

$$(a, b, c, d, e, f) \mapsto \left( b, a, c, d, e, \frac{4c^2 dn + 1}{f} \right).$$

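The claim that the non-isotropic dilation carries $\Sigma^{\rm I}_n$ to $\Sigma^{\rm II}_n$ can be tested on rational points with exact arithmetic. In the sketch below (helper names are ours) a point of $\Sigma^{\rm I}_n$ is produced from $(a, c, d)$ via (2.6)-(2.8), the dilation is applied, and the nine Type II constraints (2.13)-(2.21) are evaluated.

```python
from fractions import Fraction as F

def sigma_I_point(n, a, c, d):
    """A rational point of Sigma^I_n built from (a, c, d) via (2.6)-(2.8)."""
    f = F(4 * a * c * d - n)
    if f == 0:
        raise ValueError("need 4acd != n")
    return F(a), (n * a + c) / f, F(c), F(d), (4 * a * a * d + 1) / f, f

def dilate_I_to_II(n, a, b, c, d, e, f):
    """The map (a, b, c, d, e, f) -> (a, b, c/n^2, dn, n^2 e, f/n)."""
    return a, b, c / (n * n), d * n, e * n * n, f / n

def constraints_II(n, a, b, c, d, e, f):
    """The identities (2.13)-(2.21) as (lhs, rhs) pairs."""
    return [
        (4 * a * b * d, n + e),                          # (2.13)
        (c * e, a + b),                                  # (2.14)
        (4 * a * b * c * d, a + b + n * c),              # (2.15)
        (4 * a * c * d * e, n + 4 * a * a * d + e),      # (2.16)
        (4 * b * c * d * e, n + 4 * b * b * d + e),      # (2.17)
        (4 * a * c * d, f + 1),                          # (2.18)
        (e * f, n + 4 * a * a * d),                      # (2.19)
        (b * f, n * c + a),                              # (2.20)
        (4 * c * c * d * n + 1, f * (4 * b * c * d - 1)),  # (2.21)
    ]

image = dilate_I_to_II(5, *sigma_I_point(5, 1, 2, 3))
print(all(lhs == rhs for lhs, rhs in constraints_II(5, *image)))
```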

Analogously to $\pi^{\rm I}_n$, we have the map $\pi^{\rm II}_n : \Sigma^{\rm II}_n \to S_n$ given by

$$\pi^{\rm II}_n : (a, b, c, d, e, f) \mapsto (abd, acdn, bcdn) \tag{2.22}$$

which is injective up to the dilation symmetry (2.10) and which, when $n$ is a natural number, maps $\mathbf{N}$-points of $\Sigma^{\rm II}_n$ to $\mathbf{N}$-points of $S_n$, and when $abd$ is coprime to $n$, gives Type II solutions. (Note that this latter condition is automatic when $n$ is prime, since $x, y, z$ cannot all be divisible by $n$.)



We have an analogue of Proposition 2.2:

Proposition 2.6 (Description of Type II solutions). Let $n \in \mathbf{N}$, and let $(x, y, z)$ be a Type II solution. Then there exists a unique $(a, b, c, d, e, f) \in \mathbf{N}^6 \cap \Sigma^{\rm II}_n$ with $abd$ coprime to $n$ and $a, b, c$ having no common factor, such that $\pi^{\rm II}_n(a, b, c, d, e, f) = (x, y, z)$.

Proof. Uniqueness follows from injectivity modulo dilations of $\pi^{\rm II}_n$ as before. To show existence, we factor $x = dx'$, $y = ndy'$, $z = ndz'$, where $x', y', z'$ are coprime, then after multiplying (1.1) by $ndx'y'z'$ we have

$$4dx'y'z' = ny'z' + x'y' + x'z'. \tag{2.23}$$

As $x'$ is coprime to $n$, we conclude that $x'$ divides $y'z'$, $y'$ divides $x'z'$, and $z'$ divides $x'y'$. Splitting into prime factors, we again obtain the representation (2.12) for some natural numbers $a, b, c$; since $x', y', z'$ have no common factor, $a, b, c$ have no common factor also. As $x$ was coprime to $n$, $abd$ is coprime to $n$ also.

Substituting (2.12) into (2.23) we obtain (2.15), which in particular implies that $c$ divides $a + b$. If we then set $e := (a + b)/c$ and $f := 4acd - 1$, then $e, f$ are natural numbers, and we obtain the other identities (2.13)-(2.21) by routine algebra. By construction we have $\pi^{\rm II}_n(a, b, c, d, e, f) = (x, y, z)$, and the claim follows. □
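As in the Type I case, the existence argument is constructive. The sketch below (the function name is hypothetical) factors a Type II solution as $x = dx'$, $y = ndy'$, $z = ndz'$, recovers $a, b, c$ as pairwise greatest common divisors, and then sets $e = (a + b)/c$ and $f = 4acd - 1$ as in the proof.

```python
from math import gcd

def type_II_sextuple(n, x, y, z):
    """Coordinates (a, b, c, d, e, f) of a Type II solution (x, y, z) of (1.1)."""
    if 4 * x * y * z != n * (y * z + x * z + x * y):
        raise ValueError("(x, y, z) does not solve 4/n = 1/x + 1/y + 1/z")
    if gcd(n, x) != 1 or y % n != 0 or z % n != 0:
        raise ValueError("not a Type II solution")
    d = gcd(gcd(x, y // n), z // n)         # common factor of x, y/n, z/n
    xp, yp, zp = x // d, y // (n * d), z // (n * d)
    a, b, c = gcd(xp, yp), gcd(xp, zp), gcd(yp, zp)
    assert (xp, yp, zp) == (a * b, a * c, b * c)          # (2.12)
    e, f = (a + b) // c, 4 * a * c * d - 1
    assert c * e == a + b and e * f == n + 4 * a * a * d  # (2.14), (2.19)
    return a, b, c, d, e, f

# 4/5 = 1/2 + 1/5 + 1/10 is a Type II solution for n = 5.
a, b, c, d, e, f = type_II_sextuple(5, 2, 5, 10)
print((a, b, c, d, e, f), (a * b * d, a * c * d * 5, b * c * d * 5))
```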

Again, we can recover some known characterisations of Type II solvability:

Proposition 2.7. Let $n$ be a natural number. Then the following are equivalent:

• There exists a Type II solution $(x, y, z)$.

• There exist $a, b, e \in \mathbb{N}$ with $e \mid a + b$ and $4ab \mid n + e$, and $(n + e)/4$ coprime to $n$.

• There exist $a, b, c, d \in \mathbb{N}$ such that $4abcd = a + b + nc$ with $abd$ coprime to $n$.

• There exist $a, b, d \in \mathbb{N}$ with $4abd - 1 \mid b + nc$ with $abd$ coprime to $n$.

• There exist $a, c, d, e \in \mathbb{N}$ such that $n = (4acd - 1)e - 4a^2d$ with $(n + e)/4$ coprime to $n$.

• There exist $a, c, d, e \in \mathbb{N}$ such that $n = 4ad(ce - a) - e = e(4acd - 1) - 4a^2d$ with $ad(ce - a)$ coprime to $n$.
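The second criterion lends itself to a direct search. The sketch below is a naive illustration rather than an efficient algorithm: the function name `find_type2_solution` is hypothetical, and the cut-off $e \le n$ is an assumption taken from the bound $n/4 < abd \le n/2$ (valid when $a \le b$) established in the proof of Lemma 2.8, combined with $4abd = n + e$. Given a witness $(a, b, e)$ it sets $d = (n+e)/(4ab)$ and $c = (a+b)/e$ and returns $(abd, acdn, bcdn)$.

```python
from fractions import Fraction
from math import gcd

def find_type2_solution(n):
    """Search for a, b, e with e | a+b, 4ab | n+e and (n+e)/4 coprime to n.

    Returns a Type II solution (x, y, z) = (abd, acdn, bcdn), or None.
    """
    for e in range(1, n + 1):              # assumed search range, see lead-in
        if (n + e) % 4 != 0:
            continue
        m = (n + e) // 4                   # m plays the role of abd
        if gcd(m, n) != 1:
            continue
        for a in range(1, m + 1):
            if m % a != 0:
                continue
            for b in range(a, m // a + 1):  # a <= b by symmetry
                if (m // a) % b != 0 or (a + b) % e != 0:
                    continue
                d = m // (a * b)
                c = (a + b) // e
                x, y, z = a * b * d, a * c * d * n, b * c * d * n
                assert Fraction(4, n) == Fraction(1, x) + Fraction(1, y) + Fraction(1, z)
                return x, y, z
    return None

print(find_type2_solution(5))   # a witness is a = 1, b = 2, e = 3
print(find_type2_solution(9))   # odd squares admit no Type II solution
```

For $n = 5$ the search returns $(2, 5, 10)$, while for the odd square $n = 9$ it finds nothing, in line with the insolubility result for odd squares proved in Section 4.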

Next, we record some bounds on the order of magnitude of the parameters $a, b, c, d, e, f$ assuming that $y \le z$.

Lemma 2.8. Let $n \in \mathbb{N}$, and suppose that $(x, y, z) = \pi_n^{\mathrm{I}}(a, b, c, d, e, f)$ is a Type I solution such that $y \le z$. Then

$$a \le b$$

$$\tfrac{1}{4} n < acd \le \tfrac{3}{4} n$$

$$b < ce \le 2b$$

$$an \le bf \le \tfrac{5}{3} an.$$

If instead $(x, y, z) = \pi_n^{\mathrm{II}}(a, b, c, d, e, f)$ is a Type II solution such that $y \le z$, then

$$a \le b$$

$$\tfrac{1}{4} n < acde \le n$$

$$b < ce \le 2b$$

$$3acd \le f < 4acd.$$

Informally, the above lemma asserts that the magnitudes of the quantities $(a, b, c, d, e, f)$ are controlled entirely by the parameters $(a, c, d, f)$ (in the Type I case) and $(a, c, d, e)$ (in the Type II case), with the bounds $acd \sim n$, $f \ll n$ in the Type I case and $acde \sim n$ in the Type II case. The constants in the bounds here could be improved slightly, but such improvements will not be of importance in our applications.



Proof. First suppose we have a Type I solution. As $y \le z$, we have $a \le b$. From (2.2) we then have $b < ce \le 2b$, and thus from (2.8) we have

$$an \le bf \le an + \frac{2}{ef}\, bf.$$

Now, from (2.7), $ef = 1 \bmod 4$. If $e = f = 1$, then from (2.2) and (2.8) we would have $b = na + c = na + a + b$, which is absurd, thus $ef \ge 5$. This gives $bf \le 5an/3$ as claimed.

From (2.8) this implies that $c \le 2an/3$, which in particular implies that $bcd < abdn$ and so $y \le z < x$. From (1.1) we conclude that

$$\frac{4}{3n} \le \frac{1}{y} < \frac{4}{n}$$

which gives the bound $n/4 < acd \le 3n/4$ as claimed.

Now suppose we have a Type II solution. Again $a \le b$ and $b < ce \le 2b$. From (2.15) we have

$$nc < 4abcd \le nc + 2abcd$$

and thus $n/4 < abd \le n/2$, which by the $ce$ bound gives $n/4 < acde \le n$. Since $f = 4acd - 1$, we have $3acd \le f < 4acd$, and the claim follows. □

Remark 2.9. From the above bounds one can also easily deduce the following observation: 
if 4/p = 1/x + 1/y + 1/z, then the largest denominator max(x, y, z) is always divisible by p. 
(This observation also appears in U3^ -) 
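This observation is easy to test numerically for small primes. The following brute-force sketch is an illustration only (the helper names `sorted_solutions` and `count_ordered` are hypothetical): it lists all solutions with $x \le y \le z$ using the elementary ranges $n/4 < x \le 3n/4$ and $1/y < 4/n - 1/x \le 2/y$, counts ordered solutions, and checks that the largest denominator is a multiple of $p$.

```python
from fractions import Fraction
from itertools import permutations

def sorted_solutions(n, m=4):
    """All (x, y, z) with x <= y <= z and m/n = 1/x + 1/y + 1/z."""
    target = Fraction(m, n)
    out = []
    # x is the smallest denominator, so 1/x < m/n <= 3/x.
    for x in range(n // m + 1, 3 * n // m + 1):
        rest = target - Fraction(1, x)
        # y is the smaller of the remaining two, so 1/y < rest <= 2/y.
        y_lo = max(x, rest.denominator // rest.numerator + 1)
        y_hi = 2 * rest.denominator // rest.numerator
        for y in range(y_lo, y_hi + 1):
            last = rest - Fraction(1, y)
            if last.numerator == 1:        # last = 1/z with z >= y
                out.append((x, y, last.denominator))
    return out

def count_ordered(n, m=4):
    """Number of ordered solutions, i.e. f(n) when m = 4."""
    return sum(len(set(permutations(t))) for t in sorted_solutions(n, m))

for p in (3, 5, 7):
    sols = sorted_solutions(p)
    print(p, count_ordered(p), all(z % p == 0 for _, _, z in sols))
```

The counts produced for $p = 3, 5, 7$ are $12, 12, 36$, agreeing with the values of $f$ listed in the introduction, and in each case every solution has its largest denominator divisible by $p$.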



Remark 2.10. Propositions 2.2, 2.6 can be viewed as special cases of the classification by Heath-Brown [24] of primitive integer points $(x_1, x_2, x_3, x_4) \in (\mathbb{Z} \setminus \{0\})^4$ on Cayley's surface

$$\left\{ (x_1, x_2, x_3, x_4) : \frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \frac{1}{x_4} = 0 \right\},$$

where by "primitive" we mean that $x_1, x_2, x_3, x_4$ have no common factor. Note that if $n, x, y, z$ solve (1.1), then $(-n, 4x, 4y, 4z)$ is an integer point on this surface, which will be primitive when $n$ is prime. In [24, Lemma 1] it is shown that such integer points $(x_1, x_2, x_3, x_4)$ take the form

$$x_i = \epsilon\, y_j y_k y_l z_{ij} z_{ik} z_{il}$$

for $\{i, j, k, l\} = \{1, 2, 3, 4\}$, where $\epsilon \in \{-1, +1\}$ is a sign, and the $y_i, z_{ij}$ are non-zero integers obeying the coprimality constraints

$$(y_i, y_j) = (z_{ij}, z_{kl}) = (y_i, z_{ij}) = 1$$

for $\{i, j, k, l\} = \{1, 2, 3, 4\}$, and obeying the equation

$$\sum_{\{i,j,k,l\} = \{1,2,3,4\}} y_i z_{jk} z_{kl} z_{lj} = 0. \tag{2.24}$$

Conversely, any $\epsilon, y_i, z_{ij}$ obeying the above conditions induces a primitive integer point on Cayley's surface. The Type I (resp. Type II) solutions correspond, roughly speaking, to the cases when one of the $z_{1i}$ (resp. one of the $y_i$) in the factorisation

$$n = x_1 = \epsilon\, y_2 y_3 y_4 z_{12} z_{13} z_{14}$$

is equal to $\pm n$. The $y_i, z_{ij}$ coordinates are closely related to the $(a, b, c, d, e, f)$ coordinates used in this section; in [24] it is observed that these coordinates obey a number of algebraic equations in addition to (2.24), which essentially describe (the closure of) the universal torsor of Cayley's surface.

In [24] it was shown that the number of integer points $(x_1, x_2, x_3, x_4)$ on Cayley's surface of maximal height $\max(|x_1|, \ldots, |x_4|)$ bounded by $N$ was comparable to $N \log^6 N$. This is not quite the situation considered in our paper; a solution to (1.1) with $n \le N$ induces an integer point $(x_1, x_2, x_3, x_4)$ whose minimal height $\min(|x_1|, \ldots, |x_4|)$ is bounded by $N$. Nevertheless, the results in [24] can be easily modified (by minor adjustments to account for the restriction that three of the $x_i$ are positive, and restricting $n$ to be a multiple of 4 to eliminate divisibility constraints) to give a lower bound $\sum_{n \le N} f(n) \gg N \log^6 N$ for the number of such points, though it is not immediately obvious whether this lower bound can be matched by a corresponding upper bound. Nevertheless, we see that there are several logarithmic factors separating the general solution count from the Type I and Type II solution count; in particular, for generic $n$, the majority of solutions to (1.1) will neither be Type I nor Type II. In spite of this, the number of Type I and Type II solutions is the relevant quantity for studying the Erdős-Straus conjecture, as it is natural to study it for prime denominators only.

We close this section with a small remark on the well known standard classification of solutions in Mordell's book. His two cases (in his notation)

$$\frac{m}{p} = \frac{1}{abd} + \frac{1}{acd} + \frac{1}{bcdp}$$

with $(a, b) = (a, c) = (b, c) = 1$ and $p \nmid abcd$, and

$$\frac{m}{p} = \frac{1}{abd} + \frac{1}{acdp} + \frac{1}{bcdp}$$

with $(a, b) = (a, c) = (b, c) = 1$ and $p \nmid abd$, suggest that $p \mid c$ might be possible. Here we prove, for $m > 3$ and $p$ coprime to $m$, that none of the denominators can be divisible by $p^2$. In particular $p \nmid abcd$ in both of the cases above.

Proposition 2.11. Let $m/p = 1/x + 1/y + 1/z$ where $m > 3$, $p$ is a prime not dividing $m$, and $x, y, z$ are natural numbers. Then none of $x, y, z$ are divisible by $p^2$.

Note that there are a small number of counterexamples to this proposition for $m \le 3$, such as $3/2 = 1/1 + 1/4 + 1/4$.

Proof. We may assume that $(x, y, z)$ is either a Type I or Type II solution (replacing 4 by $m$ as needed). In the Type I case $(x, y, z) = (abdp, acd, bcd)$, the claim is already clear since $abcd$ is known to be coprime to $p$. In the Type II case $(x, y, z) = (abd, acdp, bcdp)$ it is known that $abd$ is coprime to $p$, so the only remaining task is to establish that $c$ is coprime to $p$ also. Suppose $c$ is not coprime to $p$; then $y, z$ are both divisible by $p^2$. In particular

$$\frac{1}{y} + \frac{1}{z} \le \frac{2}{p^2}$$

and hence

$$\frac{m}{p} > \frac{1}{x} \ge \frac{m}{p} - \frac{2}{p^2}.$$

Taking reciprocals, we conclude that

$$p < mx \le p \left( 1 - \frac{2}{mp} \right)^{-1}.$$

Bounding $(1 - \epsilon)^{-1} < 1 + 2\epsilon$ when $0 < \epsilon < 1/2$, we conclude that

$$p < mx < p + \frac{4}{m}.$$

But if $m \ge 4$, this forces $mx$ to be a non-integer, a contradiction. □



3. Upper bounds for $f_i(n)$

We may now prove Proposition 1.7.

We begin with the bound for $f_{\mathrm{I}}(n)$. By symmetry we may restrict attention to Type I solutions $(x, y, z)$ for which $y \le z$. By Proposition 2.2 and Lemma 2.8, these solutions arise from sextuples $(a, b, c, d, e, f) \in \mathbb{N}^6 \cap \Sigma_n^{\mathrm{I}}$ obeying the Type I bounds in Lemma 2.8. In particular we see that

$$e \cdot f \cdot (cd)^2 \cdot ac = (acd)^2 \left( \frac{ce}{b} \right) \left( \frac{bf}{a} \right) \ll n^3$$

and hence at least one of $e, f, cd, ac$ is $O(n^{3/5})$.

Suppose first that $e \ll n^{3/5}$. For fixed $e$, we see from (2.1) and the divisor bound (A.6) that there are $n^{O(1/\log\log n)}$ choices for $a, b, d$, giving a net total of $n^{3/5 + O(1/\log\log n)}$ points in $\Sigma_n^{\mathrm{I}}$ in this case.

Similarly, if $f \ll n^{3/5}$, (2.6) and the divisor bound give $n^{O(1/\log\log n)}$ choices for $a, c, d$ for each $f$, giving $n^{3/5 + O(1/\log\log n)}$ solutions. If $cd \ll n^{3/5}$, one uses (2.9) and the divisor bound to get $n^{O(1/\log\log n)}$ choices for $b, f, c, d$ for each choice of $cd$, and if $ac \ll n^{3/5}$, then (2.8) and the divisor bound give $n^{O(1/\log\log n)}$ choices for $a, b, c, f$ for each fixed $ac$. Putting all this together (and recalling that any three coordinates in $\Sigma_n^{\mathrm{I}}$ determine the other three) we obtain the first part of Proposition 1.7.



Now we prove the bound for $f_{\mathrm{II}}(n)$, which is similar. Again we may restrict attention to sextuples $(a, b, c, d, e, f) \in \mathbb{N}^6 \cap \Sigma_n^{\mathrm{II}}$ obeying the Type II bounds in Lemma 2.8. In particular we have

$$e^2 \cdot (ad) \cdot (ac) \cdot (cd) = (acde)^2 \ll n^2$$

and so at least one of $e, ad, ac, cd$ is $O(n^{2/5})$.

If $e \ll n^{2/5}$, we use (2.13) and the divisor bound to get $n^{O(1/\log\log n)}$ choices for $a, b, d$ for each $e$. If $ad \ll n^{2/5}$, we use (2.19) and the divisor bound to get $n^{O(1/\log\log n)}$ choices for $a, d, e, f$ for each fixed $ad$. If $ac \ll n^{2/5}$, we use (2.20) to get $n^{O(1/\log\log n)}$ choices for $a, c, b, f$ for each fixed $ac$. If $cd \ll n^{2/5}$, we use (2.21) and the divisor bound to get $n^{O(1/\log\log n)}$ choices for $b, c, d, f$ for each fixed $cd$. Putting all this together we obtain the second part of Proposition 1.7.



Remark 3.1. This argument, together with the fact that a large number $n$ can be factorised in expected $O(n^{o(1)})$ time (using, say, the quadratic sieve [52]), gives an algorithm to find all Type I solutions for a given $n$ in expected run time $O(n^{3/5 + o(1)})$, and an algorithm to find all the Type II solutions in expected run time $O(n^{2/5 + o(1)})$.



4. Insolubility for odd squares 

We now prove Proposition 1.6. Suppose for contradiction that $n$ is an odd perfect square (in particular, $n = 1 \bmod 8$) with a Type I solution. Then by Proposition 2.2, we can find an $\mathbb{N}$-point $(a, b, c, d, e, f)$ in $\Sigma_n^{\mathrm{I}}$.

Let $q$ be the largest odd factor of $ab$. From (2.1) we have $ne + 1 = 0 \bmod q$. Since $n$ is a perfect square, we conclude that

$$\left( \frac{e}{q} \right) = \left( \frac{-1}{q} \right) = (-1)^{(q-1)/2}$$

thanks to (A.8). Since $n = 1 \bmod 8$, we see from (2.1) that $e = 3 \bmod 4$. By quadratic reciprocity (A.7) we thus have

$$\left( \frac{q}{e} \right) = 1.$$

On the other hand, from (2.2) we see that $ab = -a^2 \bmod e$, and thus

$$\left( \frac{ab}{e} \right) = \left( \frac{-1}{e} \right) = -1$$

by (A.8). This forces $ab \ne q$, and so (by definition of $q$) $ab$ is even. By (2.1), this forces $e = 7 \bmod 8$, which by (A.9) implies that

$$\left( \frac{2}{e} \right) = 1$$

and thus

$$\left( \frac{ab}{e} \right) = \left( \frac{q}{e} \right) = 1,$$

a contradiction.

The proof in the Type II case is almost identical, using (2.13), (2.14) in place of (2.1), (2.2); we omit the details.
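The argument rests on two standard facts about the Jacobi symbol: $(-1/e) = -1$ when $e = 3 \bmod 4$, and $(2/e) = 1$ when $e = 7 \bmod 8$. We assume these are the facts recorded as (A.8) and (A.9) in the appendix, which is not reproduced here. As a sanity check, the following Python sketch implements the Jacobi symbol by the usual binary reciprocity algorithm (the function name `jacobi` is our own) and verifies both facts on a range of moduli.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for odd positive n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    a %= n
    result = 1
    while a != 0:
        # Pull out factors of 2 using the value of (2/n).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Quadratic reciprocity: swap, flipping sign if both are 3 mod 4.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

print(all(jacobi(-1, e) == -1 for e in range(3, 400, 4)))   # e = 3 mod 4
print(all(jacobi(2, e) == 1 for e in range(7, 400, 8)))     # e = 7 mod 8
```

Both checks print `True`. In the proof these two evaluations are what make $(ab/e)$ equal to $-1$ and to $+1$ at the same time.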
5. Lower bounds I

Now we prove the lower bounds in Theorem 1.1. We begin with the lower bound

$$\sum_{n \le N} f_{\mathrm{II}}(n) \gg N \log^3 N. \tag{5.1}$$

Suppose $a, c, d, e$ are natural numbers with $d$ square-free, $e$ coprime to $ad$, $e > a$, and $acde \le N/4$. Then the quantity

$$n := 4acde - e - 4a^2d \tag{5.2}$$

is a natural number of size at most $N$, and $(a, ce - a, c, d, e, 4acd - 1)$ is an $\mathbb{N}$-point of $\Sigma_n^{\mathrm{II}}$. Applying $\pi_n^{\mathrm{II}}$, we obtain a solution

$$(x, y, z) = (a(ce - a)d,\ acdn,\ (ce - a)cdn)$$

to (1.1). We claim that this is a Type II solution, or equivalently that $a(ce - a)d$ is coprime to $n$. As $e$ is coprime to $ad$, we see from (5.2) that $n$ is coprime to $ade$, so it suffices to show that $n$ is coprime to $b := ce - a$. But if $q$ is a common factor of both $n$ and $b$, then from the identity (2.20) (with $f = 4acd - 1$) we see that $q$ is also a common factor of $a$, a contradiction. Thus we have obtained a Type II solution. Also, as $d$ is square-free, any two quadruples $(a, c, d, e)$ will generate different solutions, as the associated sextuples $(a, ce - a, c, d, e, 4acd - 1)$ cannot be related to each other by the dilation (2.10). Thus, it will suffice to show that there are at least $\delta N \log^3 N$ quadruples $(a, c, d, e) \in \mathbb{N}^4$ with $d$ square-free, $e$ coprime to $ad$, $e > a$, and $acde \le N/4$ for some absolute constant $\delta > 0$. Restricting $a, c, d$ to be at most $N^{0.1}$ (say), we see that the number of possible choices of $e$ is at least $\delta' (N/acd)\, \phi(ad)/ad$, where $\phi$ is the Euler totient function and $\delta' > 0$ is another absolute constant. It thus suffices to show that

$$\sum_{a, c, d \le N^{0.1}} \mu^2(d)\, \frac{\phi(ad)}{a^2 c d^2} \gg \log^3 N,$$

where $\mu$ is the Möbius function (so $\mu^2(d) = 1$ exactly when $d$ is square-free). Using the elementary estimate $\phi(ad) \ge \phi(a)\phi(d)$ and factorising, we see that it suffices to show that

$$\sum_{d \le N^{0.1}} \frac{\mu^2(d)\phi(d)}{d^2} \gg \log N. \tag{5.3}$$

But this follows from Lemma A.1.
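The construction behind (5.1) is completely explicit, so it can be replayed on small parameters. The sketch below is illustrative only: the function name `type2_from_quadruple` is hypothetical, and the formulas are taken directly from (5.2) and the displayed solution $(a(ce-a)d, acdn, (ce-a)cdn)$.

```python
from fractions import Fraction
from math import gcd

def type2_from_quadruple(a, c, d, e):
    """Build n as in (5.2) and the solution (abd, acdn, bcdn) with b = ce - a."""
    n = 4 * a * c * d * e - e - 4 * a * a * d
    b = c * e - a
    return n, (a * b * d, a * c * d * n, b * c * d * n)

n, (x, y, z) = type2_from_quadruple(1, 1, 1, 3)
print(n, (x, y, z))
# The triple solves (1.1) and x is coprime to n, so this is a Type II solution.
print(Fraction(4, n) == Fraction(1, x) + Fraction(1, y) + Fraction(1, z))
print(gcd(x, n) == 1)
```

With $(a, c, d, e) = (1, 1, 1, 3)$ one gets $n = 5$ and the solution $(2, 5, 10)$. Running the same check over a grid of small quadruples with $e$ coprime to $ad$ and $e > a$ is a quick way to see the coprimality claim in action.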
Now we prove the lower bound

$$\sum_{n \le N} f_{\mathrm{I}}(n) \gg N \log^3 N,$$

which follows by a similar method.

Suppose $a, c, d, f$ are natural numbers with $d$ square-free, $f$ dividing $4a^2d + 1$ and coprime to $c$, $d \ge f$, and $acd \le N/4$. Then the quantity

$$n := 4acd - f \tag{5.4}$$

is a natural number which is at most $N$, and $(a, b, c, d, (4a^2d + 1)/f, f)$ is an $\mathbb{N}$-point of $\Sigma_n^{\mathrm{I}}$, where

$$b := c\, \frac{4a^2d + 1}{f} - a = \frac{na + c}{f}.$$

Applying $\pi_n^{\mathrm{I}}$, this gives a solution

$$(x, y, z) = (abdn, acd, bcd)$$

to (1.1), and as before the square-free nature of $d$ ensures that each quadruple $(a, c, d, f)$ gives a different solution. We claim that this is a Type I solution, i.e. that $abcd$ is coprime to $n$. As $f$ divides $4a^2d + 1$, $f$ is coprime to $ad$, and hence by (5.4) $n$ is coprime to $ad$ also. As $f$ and $c$ are coprime by assumption, $n$ is coprime to $acd$ by (5.4). As $b = (na + c)/f$, we conclude that $n$ is also coprime to $b$.
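The Type I construction can be replayed in the same way. In the sketch below (hypothetical function name `type1_from_quadruple`), $e$ denotes $(4a^2d+1)/f$, and $b$ is computed as $ce - a$ and cross-checked against the second expression $(na + c)/f$ from the text.

```python
from fractions import Fraction
from math import gcd

def type1_from_quadruple(a, c, d, f):
    """Build n as in (5.4) and the solution (abdn, acd, bcd)."""
    if (4 * a * a * d + 1) % f != 0:
        raise ValueError("f must divide 4a^2 d + 1")
    n = 4 * a * c * d - f
    e = (4 * a * a * d + 1) // f
    b = c * e - a
    assert b * f == n * a + c          # the two expressions for b agree
    return n, (a * b * d * n, a * c * d, b * c * d)

n, (x, y, z) = type1_from_quadruple(1, 2, 1, 1)
print(n, (x, y, z))
print(Fraction(4, n) == Fraction(1, x) + Fraction(1, y) + Fraction(1, z))
print(gcd(y * z, n) == 1)              # abcd is coprime to n, so this is Type I
```

With $(a, c, d, f) = (1, 2, 1, 1)$ one gets $n = 7$ and the solution $(63, 2, 18)$, that is, $4/7 = 1/63 + 1/2 + 1/18$.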

Thus it will suffice to show that there are at least $\delta N \log^3 N$ quadruples $(a, c, d, f) \in \mathbb{N}^4$ with $f$ coprime to $2ac$, and $d$ square-free with $f$ dividing $4a^2d + 1$, $d \ge f$, and $acd \le N/4$, for some absolute constant $\delta > 0$.

We restrict $a, c, f$ to be at most $N^{0.1}$. If $f$ is coprime to $2ac$, then there is a unique primitive residue class modulo $f$ such that $4a^2d + 1$ is a multiple of $f$ for all $d$ in this class. Also, there are at least $\delta N/acf$ elements $d$ of this residue class with $d \ge f$ and $acd \le N/4$ for some absolute constant $\delta > 0$; a standard sieving argument shows that a positive proportion of these elements are square-free. Thus, we have a lower bound of

$$\gg \sum_{a, c, f \le N^{0.1}:\ (f, 2ac) = 1} \frac{N}{acf}$$

for the number of quadruples. Restricting $f$ to be odd and then using the crude sieve

$$1_{(f, 2ac) = 1} \ge 1 - \sum_p 1_{p \mid f} 1_{p \mid a} - \sum_p 1_{p \mid f} 1_{p \mid c} \tag{5.5}$$

where $p$ ranges over odd primes, and where $1_E$ denotes the indicator function of a statement $E$ (i.e. $1_E = 1$ if $E$ holds, and $1_E = 0$ otherwise), one easily verifies that the above expression is at least $\delta N \log^3 N$ for some absolute constant $\delta > 0$, and the claim follows.
Now we establish the lower bound

$$\sum_{p \le N} f_{\mathrm{II}}(p) \gg N \log^2 N.$$

We will repeat the proof of (5.1), but because we are now counting primes instead of natural numbers we will need to invoke the Bombieri-Vinogradov inequality at a key juncture.

Suppose $a, c, d, e$ are natural numbers with $d$ square-free, $a, c, d \le N^{0.1}$, and $e$ between $N^{0.6}$ and $N/4acd$ with

$$p := 4acde - e - 4a^2d \tag{5.6}$$

prime. Then $p$ is at most $N$ and at least $N^{0.6}$, and in particular is automatically coprime to $ade$ (and thus to $ce - a$, by previous arguments). Thus, as before, each such $(a, c, d, e)$ gives a Type II solution for a prime $p \le N$, with different quadruples giving different solutions. Thus it suffices to show that there are at least $\delta N \log^2 N$ quadruples $(a, c, d, e)$ with the above properties for some absolute constant $\delta > 0$.
Fix $a, c, d$. As $e$ ranges from $N^{0.6}$ to $N/4acd$, the expression (5.6) traces out a primitive residue class modulo $4acd - 1$, omitting at most $O(N^{0.6})$ members of this class that are less than $N$. Thus, the number of primes of the form (5.6) for fixed $a, c, d$ is

$$\pi(N; 4acd - 1, -4a^2d) - O(N^{0.6}),$$

where $\pi(N; q, t)$ denotes the number of primes $p < N$ that are congruent to $t$ mod $q$. We replace $\pi(N; 4acd - 1, -4a^2d)$ by a good approximation, and bound the error. If we set

$$D(N; q) := \max_{(a, q) = 1} \left| \pi(N; q, a) - \frac{\mathrm{li}(N)}{\phi(q)} \right|$$

(as in (A.13)), where $\mathrm{li}(x) := \int_0^x dt/\log t$ is the logarithmic integral, the number of primes of the form (5.6) for fixed $a, c, d$ is at least

$$\frac{\mathrm{li}(N)}{\phi(4acd - 1)} - D(N; 4acd - 1) - O(N^{0.6}).$$

The overall contribution of the $O(N^{0.6})$ error term, summed over all choices of $a, c, d$, is at most $O((N^{0.1})^3 N^{0.6}) = o(N \log^2 N)$, while $\mathrm{li}(N)$ is comparable to $N/\log N$, so it will suffice to show the lower bound

$$\sum_{a, c, d \le N^{0.1}} \frac{\mu^2(d)}{\phi(4acd - 1)} \gg \log^3 N \tag{5.7}$$

and the upper bound

$$\sum_{a, c, d \le N^{0.1}} D(N; 4acd - 1) = o(N \log^2 N). \tag{5.8}$$



We first prove (5.7). Using the trivial bound $\phi(4acd - 1) \le 4acd$, it suffices to show that

$$\sum_{a, c, d \le N^{0.1}} \frac{\mu^2(d)}{acd} \gg \log^3 N$$

which upon factorising reduces to showing

$$\sum_{d \le N^{0.1}} \frac{\mu^2(d)}{d} \gg \log N.$$

But this follows from Lemma A.1.



Now we show (5.8). Writing $q := 4acd - 1$, we can upper bound the left-hand side of (5.8) somewhat crudely by

$$\sum_{q \le N^{0.3}} D(N; q)\, \tau(q + 1)^2.$$

From divisor moment estimates (see (A.4)) we have

$$\sum_{q \le N^{0.3}} \frac{\tau(q + 1)^4}{q} \ll \log^{O(1)} N;$$

hence by Cauchy-Schwarz, we may bound the preceding quantity by

$$\ll \log^{O(1)} N \left( \sum_{q \le N^{0.3}} q\, D(N; q)^2 \right)^{1/2}.$$

Using the trivial bound $D(N; q) \ll N/q$, we bound this in turn by

$$\ll N^{1/2} \log^{O(1)} N \left( \sum_{q \le N^{0.3}} D(N; q) \right)^{1/2}.$$

But from the Bombieri-Vinogradov inequality (A.14), we have

$$\sum_{q \le N^{0.3}} D(N; q) \ll_A N \log^{-A} N$$

for any $A > 0$, and the claim (5.8) follows.



Finally, we establish the lower bound

$$\sum_{p \le N} f_{\mathrm{I}}(p) \gg N \log^2 N.$$

Unsurprisingly, we will repeat many of the arguments from preceding cases. Suppose $a, c, d, f$ are natural numbers with $a, c, f \le N^{0.1}$ with $(a, c) = (2ac, f) = 1$, $N^{0.6} \le d \le N/4ac$, such that $f$ divides $4a^2d + 1$, and the quantity

$$p := 4acd - f \tag{5.9}$$

is prime. Then $p$ is at most $N$ and is at least $N^{0.6}$, and in particular is coprime to $a, c, f$; from (5.9) it is coprime to $d$ also. This thus yields a Type I solution for $p$; by the coprimality of $a, c$, these solutions are all distinct as no two of the associated sextuples $(a, b, c, d, (4a^2d + 1)/f, f)$ can be related by (2.10). Thus it suffices to show that there are at least $\delta N \log^2 N$ quadruples $(a, c, d, f)$ with the above properties for some absolute constant $\delta > 0$.

For fixed $a, c, f$, the parameter $d$ traverses a primitive congruence class modulo $f$, and $p = 4acd - f$ traverses a primitive congruence class modulo $4acf$ that omits at most $O(N^{0.6})$ of the elements of this class that are less than $N$. By (A.13), the total number of $d$ that thus give a prime $p$ for fixed $a, c, f$ is at least

$$\frac{\mathrm{li}(N)}{\phi(4acf)} - D(N; 4acf) - O(N^{0.6})$$

and so by arguing as before it suffices to show the bounds

$$\sum_{a, c, f \le N^{0.1}} \frac{1_{(a, c) = (2ac, f) = 1}}{\phi(4acf)} \gg \log^3 N$$

and

$$\sum_{a, c, f \le N^{0.1}} D(N; 4acf) = o(N \log^2 N).$$

But this is proven by a simple modification of the arguments used to establish (5.7), (5.8) (the constraints $(a, c) = (2ac, f) = 1$ being easily handled by an elementary sieve such as (5.5)). This concludes all the lower bounds for Theorem 1.1.
6. Lower bounds II

Here we prove Theorem 1.8.

Proof. For any natural numbers $m, n$, let $g_2(m, n)$ denote the number of solutions $(x, y) \in \mathbb{N}^2$ to the Diophantine equation $m/n = 1/x + 1/y$. Since

$$\frac{1}{x} + \frac{1}{y} = \frac{1}{x} + \frac{1}{2y} + \frac{1}{2y}$$

we conclude the crude bound $f(n) \ge g_2(4, n)$ for any $n$.

In [H Theorem 1] it was shown that $g_2(m, n) \gg 3^s$ whenever $n$ is the product of $s$ distinct primes congruent to $-1$ mod $m$. Since $g_2(m, kn) \ge g_2(m, n)$ for any $k$, we conclude that

$$f(n) \ge g_2(4, n) \gg 3^{w_4(n)} \tag{6.1}$$

for all $n$, where $w_m(n)$ is the number of distinct prime factors of $n$ that are congruent to $-1$ mod $m$.
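For small $n$ the quantity $g_2(m, n)$ can be computed directly, which makes the doubling trick concrete. The Python sketch below is illustrative (the names `two_term_solutions`, `g2` and `lift` are our own): it lists the solutions with $x \le y$ using the range $n/m < x \le 2n/m$, counts ordered pairs, and lifts each pair $(x, y)$ to the triple $(x, 2y, 2y)$.

```python
from fractions import Fraction

def two_term_solutions(m, n):
    """All (x, y) with x <= y and m/n = 1/x + 1/y."""
    target = Fraction(m, n)
    out = []
    # x is the smaller denominator, so 1/x < m/n <= 2/x.
    for x in range(n // m + 1, 2 * n // m + 1):
        rest = target - Fraction(1, x)
        if rest.numerator == 1:
            out.append((x, rest.denominator))
    return out

def g2(m, n):
    """Number of ordered solutions (x, y) of m/n = 1/x + 1/y."""
    return sum(1 if x == y else 2 for x, y in two_term_solutions(m, n))

def lift(pair):
    """Turn a two-term solution into a three-term one: (x, y) -> (x, 2y, 2y)."""
    x, y = pair
    return (x, 2 * y, 2 * y)

for n in (3, 21):
    print(n, g2(4, n), [lift(t) for t in two_term_solutions(4, n)])
```

For $n = 3$ and $n = 21 = 3 \cdot 7$ this gives $g_2(4, n) = 2$ and $4$ respectively, and every lifted triple solves (1.1), which is the content of the crude bound $f(n) \ge g_2(4, n)$.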

Now we prove the first part of the theorem. Let $s$ be a large number, and let $n$ be the product of the first $s$ primes equal to $-1$ mod 4; then from the prime number theorem in arithmetic progressions we have $\log n = (1 + o(1)) s \log s$, and thus $s = (1 + o(1)) \log n / \log\log n$. From (6.1) we then have

$$f(n) \ge \exp\left( \log 3\, (1 + o(1)) \frac{\log n}{\log\log n} \right).$$

Letting $s \to \infty$ we obtain the claim.

For the second part of the theorem, we apply the Turán-Kubilius inequality (Lemma A.2) to the additive function $w_4$. This inequality gives that

$$\sum_{n \le N} \left| w_4(n) - \tfrac{1}{2} \log\log N \right|^2 \ll N \log\log N.$$

From this and Chebyshev's inequality (see also [75, p. 307]), we see that

$$w_4(n) \ge \tfrac{1}{2} \log\log n + O\left( \xi(n) \sqrt{\log\log n} \right)$$

for all $n$ in a density 1 subset of $\mathbb{N}$. The claim then follows from (6.1).

Now we turn to the third part of the theorem. We first deal with the case when $p = 4t - 1$ is prime. Then

$$\frac{4}{p} = \frac{4}{p + 1} + \frac{1}{t(4t - 1)}$$

which in particular implies that

$$f(p) \ge g_2(4, p + 1)$$

and thus

$$f(p) \gg 3^{w_4(p + 1)}.$$

By Lemma A.3 we know that

$$w_4(p + 1) \ge \left( \tfrac{1}{2} - o(1) \right) \log\log p \tag{6.2}$$

for all $p$ in a set of primes of relative prime density 1.

It remains to deal with those primes $p$ congruent to 1 mod 4. Writing

$$\frac{4}{p} = \frac{1}{(p + 3)/4} + \frac{3}{p(p + 3)/4}$$

we see that

$$f(p) \ge g_2(3, p(p + 3)/4) \gg 3^{w_3(p(p + 3)/4)} \gg 3^{w_3(p + 3)}.$$

It thus suffices to show that

$$w_3(p + 3) \ge \left( \tfrac{1}{2} - o(1) \right) \log\log p$$

for all $p$ in a set of primes of relative density 1. But this can be established by the same techniques used to establish (6.2). □
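The two splittings of $4/p$ used in the third part are elementary identities, and a few lines of exact rational arithmetic confirm them. The sketch below is illustrative; the function names are hypothetical.

```python
from fractions import Fraction

def split_three_mod_four(p):
    """For p = 4t - 1: return the two terms 4/(p+1) and 1/(t(4t-1))."""
    assert p % 4 == 3
    t = (p + 1) // 4
    return Fraction(4, p + 1), Fraction(1, t * (4 * t - 1))

def split_one_mod_four(p):
    """For p = 1 mod 4: return the two terms 1/((p+3)/4) and 3/(p(p+3)/4)."""
    assert p % 4 == 1
    q = (p + 3) // 4
    return Fraction(1, q), Fraction(3, p * q)

for p in (3, 7, 11, 19):
    print(p, sum(split_three_mod_four(p)) == Fraction(4, p))
for p in (5, 13, 17, 29):
    print(p, sum(split_one_mod_four(p)) == Fraction(4, p))
```

In the first case $4/(p+1) = 1/t$ is already a unit fraction and the remaining term $1/(t(4t-1))$ is split into two unit fractions. In the second case $1/((p+3)/4)$ is a unit fraction and the remainder has numerator 3, which is why $g_2(3, \cdot)$ appears.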

7. Sums of divisor functions 

Let $P : \mathbb{Z} \to \mathbb{Z}$ be a polynomial with integer coefficients, which for simplicity we will assume to be non-negative, and consider the sum

$$\sum_{n \le N} \tau(P(n)).$$

In [16], Erdős established the bounds

$$N \log N \ll_P \sum_{n \le N} \tau(P(n)) \ll_P N \log N \tag{7.1}$$

for all $N > 1$ and for $P$ irreducible; note that the implied constants here can depend on both the degree and the coefficients of $P$. This is of course consistent with the heuristic $\tau(n) \sim \log n$ "on average". Of course, the irreducibility hypothesis is necessary as otherwise $P(n)$ would be expected to have many more divisors.

In this section we establish a refinement of the Erdős upper bound that gives a more precise description of the dependence of the implied constant on $P$ (and with irreducibility replaced by a much weaker hypothesis), which may be of some independent interest:

Theorem 7.1 (Erdős-type bound). Let $N > 1$, and let $P$ be a polynomial with degree $D$ and coefficients being non-negative integers of magnitude at most $N^l$. For any natural number $m$, let $\rho(m)$ be the number of roots of $P$ mod $m$ in $\mathbb{Z}/m\mathbb{Z}$, and suppose one has the bound

$$\rho(p^j) \le C \tag{7.2}$$

for all primes $p$ and all $j \ge 1$. Then

$$N \sum_{m \le N} \frac{\rho(m)}{m} \ll \sum_{n \le N} \tau(P(n)) \ll N \sum_{m \le N} \frac{\rho(m)}{m}.$$
Remark 7.2. For any fixed $P$, one has (7.2) for some $C = C_P$ (by many applications of Hensel's lemma, and treating the case of small $p$ separately), and when $P$ is irreducible one can use tools such as Landau's prime ideal theorem to show that $\sum_{m \le N} \rho(m)/m \sim \log N$ (indeed, much more precise asymptotics are available here). See [73] for more precise bounds on $C$ in terms of quantities such as the discriminant $\Delta(P)$ of $P$; bounds of this type go back to
Nagell 14^1 and Ore f5l] / (see also W^ . fSE/)- One should in fact be able to establish a version 



of Theorem 7.1 in which the implied constant depends explicitly on $\Delta(P)$ rather than on
C by using the estimates of Henriot 125^ (which build upon earlier work of Barban- Vehov f^, 
Daniel [11], Shiu [68], Nair [44], and Nair-Tenenbaum [45]), but we will not do so here, as we will need to apply this bound in a situation in which the discriminant may be large, but for which the bound $C$ in (7.2) can still be taken to be small. However, the version of Nair's
estimate given in |^ Theorem 2], having no explicit dependence on the discriminant, may be 
able to give an alternate derivation of Theorem 7.1; we thank the referee for this observation.



Thus we see that Erdős' original result (7.1) is a corollary of Theorem 7.1. For special types of $P$ (e.g. linear or quadratic polynomials), more precise asymptotics on $\sum_{n \le N} \tau(P(n))$ are
known (see e.g. fTB/ . UOj for the linear case, and !^ . W^ . f^ . JJ^, \^ for the quadratic 
case), but the methods used are less elementary (e.g. Kloosterman sum bounds in the linear 
case, and class field theory in the quadratic case), and do not cover all ranges of coefficients of 
$P$ for the applications to the Erdős-Straus conjecture. See also [53] for another upper bound in
the quadratic case which is uniform over large ranges of coefficients but gives weaker bounds 
(losing some powers of log N). 

Proof. Our argument will be based on the methods in [16]. In this proof all implied constants will be allowed to depend on $D$, $l$ and $C$.

We begin with the lower bound, which is very easy. Clearly

$$\tau(P(n)) \ge \sum_{m \le N:\ m \mid P(n)} 1 \tag{7.3}$$

and thus

$$\sum_{n \le N} \tau(P(n)) \ge \sum_{m \le N} \sum_{n \le N:\ m \mid P(n)} 1.$$

The expression $P(n)$ mod $m$ is periodic in $n$ with period $m$, and thus for $m \le N$ one has

$$N \frac{\rho(m)}{m} \ll \sum_{n \le N:\ m \mid P(n)} 1 \ll N \frac{\rho(m)}{m} \tag{7.4}$$

which gives the lower bound on $\sum_{n \le N} \tau(P(n))$.
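The lower bound is simple enough to observe numerically. The following Python sketch is an illustration under stated assumptions: the polynomial $P(n) = n^2 + 1$ and the cut-off $N = 200$ are arbitrary choices of ours, and `tau`, `hits` and `rho` are hypothetical helper names. It checks the summed form of (7.3) and the elementary periodicity count behind (7.4), namely that each of the $\lfloor N/m \rfloor$ complete periods in $\{1, \ldots, N\}$ contains exactly $\rho(m)$ values of $n$ with $m \mid P(n)$.

```python
def tau(n):
    """Number of positive divisors of n, by trial division."""
    count, d = 0, 1
    while d * d <= n:
        if n % d == 0:
            count += 1 if d * d == n else 2
        d += 1
    return count

def P(n):
    """Example polynomial (an arbitrary illustrative choice)."""
    return n * n + 1

N = 200
divisor_sum = sum(tau(P(n)) for n in range(1, N + 1))
# hits[m] = #{n <= N : m | P(n)},  rho[m] = #{r mod m : m | P(r)}
hits = {m: sum(1 for n in range(1, N + 1) if P(n) % m == 0) for m in range(1, N + 1)}
rho = {m: sum(1 for r in range(m) if P(r) % m == 0) for m in range(1, N + 1)}

print(divisor_sum >= sum(hits.values()))                          # summed (7.3)
print(all(hits[m] >= (N // m) * rho[m] for m in hits))            # lower half of (7.4)
print(all(hits[m] <= (N // m + 1) * rho[m] for m in hits))        # upper half of (7.4)
```

All three checks print `True`. Since $N/m \le 2 \lfloor N/m \rfloor$ for $m \le N$, the second check is exactly what yields the factor $N \rho(m)/m$ in the lower bound.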

Now we turn to the upper bound, which is more difficult. We first establish a preliminary bound

$$\sum_{n \le N} \tau(P(n))^2 \ll N \log^{O(1)} N \tag{7.5}$$

using an argument of Landreau [35]. Let $n \le N$. By the coefficient bounds on $P$ we have

$$P(n) \ll N^{O(1)}. \tag{7.6}$$

Using the main lemma from [35], we conclude that

$$\tau(P(n))^2 \ll \sum_{m \le N:\ m \mid P(n)} \tau(m)^{O(1)}$$

and thus

$$\sum_{n \le N} \tau(P(n))^2 \ll \sum_{m \le N} \tau(m)^{O(1)} \sum_{n \le N:\ m \mid P(n)} 1.$$

Using (7.4) and (7.2), we may crudely bound $\sum_{n \le N:\ m \mid P(n)} 1 \ll \frac{N}{m} \tau(m)^{O(1)}$, thus

$$\sum_{n \le N} \tau(P(n))^2 \ll N \sum_{m \le N} \frac{\tau(m)^{O(1)}}{m}$$

and the claim then follows from Lemma A.1.

In view of (7.5) and the Cauchy-Schwarz inequality, we may discard from the $n$ summation any subset of $\{1, \ldots, N\}$ of cardinality at most $N \log^{-C} N$ for sufficiently large $C$. We will take advantage of this freedom in the sequel.

Suppose for the moment that we could reverse (7.3) and obtain the bound

$$\tau(P(n)) \ll \sum_{m \le N:\ m \mid P(n)} 1. \tag{7.7}$$

Combining this with (7.4), we would obtain

$$\sum_{n \le N} \tau(P(n)) \ll \sum_{m \le N} \sum_{n \le N:\ m \mid P(n)} 1 \ll N \sum_{m \le N} \frac{\rho(m)}{m}$$

which would give the theorem. Unfortunately, while (7.7) is certainly true when $P(n) \le N^2$, it can fail for larger values of $P(n)$, and from the coefficient bounds on $P$ we only have the weaker upper bound (7.6).



Nevertheless, as observed by Erdős, we have the following substitute for (7.7):

Lemma 7.3. Let $C$ be a fixed constant. For all but at most $O(N \log^{-C} N)$ values of $n$ in the range $1 \le n \le N$, either (7.7) holds, or one has

$$\tau(P(n)) \ll O(1)^r \sum_{m \in S_r:\ m \mid P(n)} 1$$

for some $2 \le r \ll (\log\log N)^2$, where $S_r$ is the set of all $m$ with the following properties:

• $m$ lies between $N^{1/4}$ and $N$.

• $m$ is $N^{1/r}$-smooth (i.e. $m$ is not divisible by any prime larger than $N^{1/r}$).

• $m$ has at most $(\log\log N)^2$ prime factors.

• $m$ is not divisible by any prime power $p^k$ with $p \le N^{1/2}$, $k > 1$, and $p^k \ge N^{1/(8(\log\log N)^2)}$.

The point here is that the exponential loss in the $O(1)^r$ factor will be more than compensated for by the $N^{1/r}$-smooth requirement, which as we shall see gains a factor of $r^{-cr}$ for some absolute constant $c > 0$.



Proof. The claim follows from (7.7) when $P(n) \le N^2$, so we may assume that $P(n) > N^2$. We factorise $P(n)$ as

$$P(n) = p_1 \cdots p_J$$

where the primes $p_1 \le \cdots \le p_J$ are arranged in non-decreasing order. Let $0 \le j < J$ be the largest integer such that $p_1 \cdots p_j \le N$. If $j = 0$ then all prime factors of $P(n)$ are greater than $N$, and thus by (7.6) we have $J = O(1)$ and thus $\tau(P(n)) = O(1)$, which makes the claim (7.7) trivial. Thus we may assume that $j \ge 1$.

Suppose first that all the primes $p_{j+1}, \ldots, p_J$ have size at least $N^{1/2}$. Then from (7.6) we in fact have $J = j + O(1)$, and so

$$\tau(P(n)) \ll \tau(p_1 \cdots p_j).$$

Note that every factor of $p_1 \cdots p_j$ divides $P(n)$ and is at most $N$, which gives (7.7). Thus we may assume that $p_{j+1}$, in particular, is less than $N^{1/2}$, which forces

$$N^{1/2} < p_1 \cdots p_j \le N \tag{7.8}$$

and $p_j < N^{1/2}$.

Following [16], we eliminate some small exceptional sets of natural numbers $n$. First we consider those $n$ for which $P(n)$ has at least $(\log\log N)^2$ distinct prime factors. For such $P(n)$, one has $\tau(P(n)) \ge 2^{(\log\log N)^2}$, which is asymptotically larger than any given power of $\log N$; thus by (7.5), the set of such $n$ has size at most $O(N \log^{-C} N)$ and can be discarded.

Next, we consider those $n$ for which $P(n)$ is divisible by a prime power $p^k$ with $p \le N^{1/2}$, $k > 1$, and $p^k \ge N^{1/(8(\log\log N)^2)}$. By reducing $k$ if necessary we may assume that $p^k \le N$. For each $p$ and $k$, there are at most $O((N/p^k) \rho(p^k)) = O(N/p^k)$ numbers $n$ with $P(n)$ divisible by $p^k$, thanks to (7.2); thus the total number of such $n$ is bounded by

$$\ll N \sum_{p \le N^{1/2}}\ \sum_{k \ge 2:\ p^k \ge N^{1/(8(\log\log N)^2)}} \frac{1}{p^k}$$

which can easily be computed to be $O(N \log^{-C} N)$. Thus we may discard all $n$ of this type.

After removing all such $n$, we must have $p_j > N^{1/(8(\log\log N)^2)}$. Indeed, after eliminating the exceptional $n$ as above, $p_1 \cdots p_j$ is the product of at most $(\log\log N)^2$ prime powers, each of which is bounded by $N^{1/(8(\log\log N)^2)}$, or is a single prime larger than $N^{1/(8(\log\log N)^2)}$. The former possibility thus contributes at most $N^{1/8}$ to the final product $p_1 \cdots p_j$; from (7.8) we conclude that the latter possibility must occur at least once, and the claim follows.

Let $r$ be the positive integer such that

$$N^{1/(r+1)} < p_j \le N^{1/r},$$

then $2 \le r \ll (\log\log N)^2$. The primes $p_{j+1}, \ldots, p_J$ have size at least $N^{1/(r+1)}$, so by (7.6) we have $J = j + O(r)$, which implies that

$$\tau(P(n)) \ll O(1)^r \tau(p_1 \cdots p_j).$$

As $p_1 \cdots p_j$ is at least $N^{1/2}$, we have

$$\tau(p_1 \cdots p_j) \le 2 \sum_{m \mid p_1 \cdots p_j;\ m \ge (p_1 \cdots p_j)^{1/2}} 1 \le 2 \sum_{m \mid p_1 \cdots p_j;\ m \ge N^{1/4}} 1.$$

Note that all $m$ in the above summand lie in $S_r$ and divide $P(n)$. The claim follows. □

Invoking the above lemma, it remains to bound

$$\sum_{m \le N} \sum_{n \le N:\ m \mid P(n)} 1 + \sum_{r=2}^{O((\log\log N)^2)} O(1)^r \sum_{m \in S_r} \sum_{n \le N:\ m \mid P(n)} 1$$

by $O(N \sum_{m \le N} \rho(m)/m)$. The first term was already shown to be acceptable by (7.4). For the second sum, we also apply (7.4) and bound it by

$$\ll N \sum_{r=2}^{O((\log\log N)^2)} O(1)^r \sum_{m \in S_r} \frac{\rho(m)}{m}. \tag{7.9}$$



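To estimate this expression, let $r, m$ be as in the above summation, and factor $m$ into primes. As in the proof of Lemma 7.3, the contribution to $m$ coming from primes less than $N^{1/(8(\log\log N)^2)}$ is at most $N^{1/8}$, and the primes larger than $N^{1/(8(\log\log N)^2)}$ that divide $m$ are distinct. Hence, by the pigeonhole principle (as in [16]), there exists $t \ge 1$ with $r 2^t \ll (\log\log N)^2$ such that the $N^{1/r}$-smooth number $m$ has at least $\lfloor rt/100 \rfloor$ distinct prime factors between $N^{1/(2^t r)}$ and $N^{1/(2^{t-1} r)}$, and can thus be factored as $m = q_1 \cdots q_{\lfloor rt/100 \rfloor} u$, where $q_1 < \cdots < q_{\lfloor rt/100 \rfloor}$ are primes between $N^{1/(2^t r)}$ and $N^{1/(2^{t-1} r)}$, and $u$ is an integer of size at most $N$. From the Chinese remainder theorem and (7.2) we have the crude bound

$$\rho(m) \le O(1)^{rt} \rho(u)$$

and thus

$$\sum_{m \in S_r} \frac{\rho(m)}{m} \ll \sum_{t=1}^{\infty} O(1)^{rt}\, \frac{1}{\lfloor rt/100 \rfloor !} \left( \sum_{N^{1/(2^t r)} \le p \le N^{1/(2^{t-1} r)}} \frac{1}{p} \right)^{\lfloor rt/100 \rfloor} \sum_{u \le N} \frac{\rho(u)}{u}.$$

By the standard asymptotic $\sum_{p < x} 1/p = \log\log x + O(1)$, we have

$$\sum_{N^{1/(2^t r)} \le p \le N^{1/(2^{t-1} r)}} \frac{1}{p} = O(1);$$

putting this all together, we can bound (7.9) by

$$\ll N \sum_{r=2}^{\infty} \sum_{t=1}^{\infty} \frac{O(1)^{rt}}{\lfloor rt/100 \rfloor !} \sum_{m \le N} \frac{\rho(m)}{m}$$

and the claim follows. □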
We isolate a simple special case of Theorem 7.1, when the polynomial $P$ is linear:

Corollary 7.4. If $a, b, N$ are natural numbers with $a, b \ll N^{O(1)}$, then

$$\sum_{n \le N} \tau(an + b) \ll \tau((a, b))\, N \log N$$

where $(a, b)$ is the greatest common divisor of $a$ and $b$.

Proof. By the elementary inequality $\tau(nm) \le \tau(n)\tau(m)$ we may factor out $(a, b)$ and assume without loss of generality that $a, b$ are coprime.

We apply Theorem 7.1 with $P(n) := an + b$. From the coprimality of $a, b$ and elementary modular arithmetic, we see that $\rho(m) \le 1$ for all $m$, and the claim follows. □
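The key input, $\rho(m) \le 1$ for coprime $a, b$, is elementary: if $g = (a, m) > 1$ then a root of $an + b$ mod $m$ would force $g \mid b$, contradicting $(a, b) = 1$, while if $g = 1$ the root is unique. A brute-force check in Python (illustrative only, with the hypothetical helper name `rho_linear`) is as follows.

```python
from math import gcd

def rho_linear(a, b, m):
    """Number of residues r mod m with a*r + b = 0 mod m."""
    return sum(1 for r in range(m) if (a * r + b) % m == 0)

ok = all(
    rho_linear(a, b, m) <= 1
    for a in range(1, 13)
    for b in range(1, 13)
    if gcd(a, b) == 1
    for m in range(1, 61)
)
print(ok)
# Without coprimality the bound can fail: 2r + 2 = 0 mod 4 has roots r = 1, 3.
print(rho_linear(2, 2, 4))
```

The first line printed is `True`, and the second shows that the count rises to 2 once $(a, b) > 1$, which is why the factor $\tau((a, b))$ appears in the corollary.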

We may now prove Proposition 1.4 from the introduction.

Proof of Proposition 1.4. We divide into two cases, depending on whether $A \ge B$ or $A \le B$.

First suppose that $A \ge B$. From Corollary 7.4 we have

$$\sum_{a \le A} \tau(kab^2 + 1) \ll A \sum_{m \le A} \frac{1}{m} \ll A \log A,$$

for each fixed $b \le B$, and the claim follows on summing in $b$. (Note that this argument in fact works whenever $A \ge B^\epsilon$ for any fixed $\epsilon > 0$.)

Now suppose that $A \le B$. For each fixed $a \le A$, we apply Theorem 7.1 to the polynomial $P_{ka}(b) := kab^2 + 1$. To do this we first must obtain a bound on $\rho_{ka}(p^j)$, where $\rho_{ka}(m)$ is the number of solutions $b \bmod m$ to $kab^2 + 1 \equiv 0 \pmod m$. Clearly $\rho_{ka}(m)$ vanishes whenever $m$ is not coprime to $ka$, so it suffices to consider $\rho_{ka}(p^j)$ when $p$ does not divide $ka$. Then $P_{ka}$ is quadratic, and a simple application of Hensel's lemma reveals that $\rho_{ka}(p^j) \le 2$ for all odd prime powers $p^j$ and $\rho_{ka}(p^j) \le 4$ for $p = 2$. We may therefore apply Theorem 7.1 and conclude that

$$\sum_{b \le B} \tau(kab^2+1) \ll B \sum_{m \le B} \frac{\rho_{ka}(m)}{m}.$$

It thus suffices to show that

$$\sum_{a \le A} \sum_{m \le B} \frac{\rho_{ka}(m)}{m} \ll A \log B \log(1+k). \tag{7.10}$$
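The prime-power bounds on $\rho_{ka}$ obtained from Hensel's lemma can be sanity-checked numerically. This is a small brute-force sketch; the parameter `c` stands for the product $ka$, and the function names and ranges are illustrative choices, not part of the paper.

```python
def rho_quadratic(c: int, m: int) -> int:
    """Number of residues b mod m with c*b^2 + 1 == 0 (mod m); c plays the role of k*a."""
    return sum(1 for b in range(m) if (c * b * b + 1) % m == 0)


def check_prime_power_bounds(c_max: int, limit: int) -> bool:
    """Check rho <= 2 at odd prime powers and rho <= 4 at powers of 2, for 1 <= c <= c_max."""
    for p in (2, 3, 5, 7, 11, 13):
        q = p
        while q <= limit:
            bound = 4 if p == 2 else 2
            if any(rho_quadratic(c, q) > bound for c in range(1, c_max + 1)):
                return False
            q *= p
    return True
```

The value 4 is attained at powers of 2: for instance $7b^2 + 1 \equiv 0 \pmod 8$ holds for every odd $b$.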






To control $\rho_{ka}(m)$, the obvious tool to use here is the quadratic reciprocity law (A.7). To apply this law, it is of course convenient to first reduce to the case when $a$ and $m$ are odd. If $m = 2^j m'$ for some odd $m'$, then $\rho_{ka}(m) \ll \rho_{ka}(m')$, and from this it is easy to see that the bound (7.10) follows from the same bound with $m$ restricted to be odd. Similarly, by splitting $a = 2^l a'$ and absorbing the $2^l$ factor into $k$ (and dividing $A$ by $2^l$ to compensate), we may assume without loss of generality that $a$ is odd.

As previously observed, $\rho_{ka}(m)$ vanishes unless $ka$ and $m$ are coprime, so we may also restrict to the case $(ka, m) = 1$, where $(n, m)$ denotes the greatest common divisor of $n, m$. If $p$ is an odd prime not dividing $ka$, then from elementary manipulation and Hensel's lemma we see that

$$\rho_{ka}(p^j) = \rho_{ka}(p) = 1 + \left(\frac{-ka}{p}\right),$$

and thus for odd $m$ coprime to $ka$ we have

$$\rho_{ka}(m) = \prod_{p \mid m} \left(1 + \left(\frac{-ka}{p}\right)\right).$$

For odd $m$, not necessarily coprime to $ka$, we thus have

$$\rho_{ka}(m) \le \prod_{p \mid m;\ (p,2ka)=1} \left(1 + \left(\frac{-ka}{p}\right)\right).$$

Using the multiplicativity properties of the Jacobi symbol, one has

$$1 + \left(\frac{-ka}{p}\right) \le \sum_{j:\ p^j \mid m} \left(\frac{-ka}{p^j}\right)$$

whenever $p \mid m$ and $(p, 2ka) = 1$, and thus

$$\rho_{ka}(m) \le \prod_{p \mid m;\ (p,2ka)=1}\ \sum_{j:\ p^j \mid m} \left(\frac{-ka}{p^j}\right).$$

The right-hand side can be expanded as

$$\sum_{q \mid m;\ (q,2ka)=1} \left(\frac{-ka}{q}\right).$$
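The inequality between $\rho_{ka}(m)$ and the divisor sum of Jacobi symbols can be tested directly for small odd $m$. The sketch below uses the standard binary algorithm for the Jacobi symbol; `c` again stands for $ka$, and all function names are illustrative.

```python
from math import gcd


def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd positive n, via the standard reciprocity algorithm."""
    assert n > 0 and n % 2 == 1
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):  # (2/n) = -1 exactly when n = 3, 5 mod 8
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:  # reciprocity sign flip
            result = -result
        a %= n
    return result if n == 1 else 0


def rho(c: int, m: int) -> int:
    """Number of residues b mod m with c*b^2 + 1 == 0 (mod m)."""
    return sum(1 for b in range(m) if (c * b * b + 1) % m == 0)


def divisor_character_sum(c: int, m: int) -> int:
    """Sum of the Jacobi symbols (-c/q) over divisors q of m coprime to 2c."""
    return sum(jacobi(-c, q) for q in range(1, m + 1)
               if m % q == 0 and gcd(q, 2 * c) == 1)
```

For example, with $c = 1$ and $m = 25$ there are two solutions of $b^2 + 1 \equiv 0 \pmod{25}$, while the divisor sum is $1 + 1 + 1 = 3$, so the inequality can be strict.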



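We can thus bound the left-hand side of (7.10) by

$$\sum_{q \le B:\ (q,2k)=1}\ \sum_{a \le A;\ (a,2q)=1} \left(\frac{-ka}{q}\right) \sum_{m \le B:\ q \mid m} \frac{1}{m}.$$

The final sum is of course $\frac{\log(B/q)}{q} + O(1/q)$. The contribution of the error term is bounded by

$$O\Big(\sum_{q \le B} \sum_{a \le A} \frac{1}{q}\Big) = O(A \log B),$$

which is acceptable, so it suffices to show that

$$\sum_{q \le B:\ (q,2k)=1}\ \sum_{a \le A;\ (a,2q)=1} \left(\frac{-ka}{q}\right) \frac{\log(B/q)}{q} \ll A \log B \log(1+k). \tag{7.11}$$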

We first dispose of an easy contribution, when $q$ is less than $A$. The expression

$$a \mapsto \left(\frac{-ka}{q}\right) 1_{(a,2q)=1}$$

is periodic with period $2q$ and sums to zero (being essentially a quadratic character on $\mathbb{Z}/2q\mathbb{Z}$), and so in this case we have

$$\sum_{a \le A;\ (a,2q)=1} \left(\frac{-ka}{q}\right) = O(q).$$

One could obtain better estimates and deal with somewhat larger $q$ here by using tools such as the Pólya–Vinogradov inequality, but we will not need to do so here; similarly for the treatment of the regime $A \le q \le kA$ below. In any event, the contribution of the $q < A$ case is bounded by

$$O\Big(\sum_{q < A} q \cdot \frac{\log(B/q)}{q}\Big) = O(A \log B),$$

which is acceptable.

Next, we deal with the contribution when $q$ is between $A$ and $kA$. Here we crudely bound the Jacobi symbol in magnitude by 1 and obtain a bound of

$$O\Big(\sum_{A \le q \le kA} A \frac{\log B}{q}\Big) = O(A \log B \log(1+k)),$$

which is acceptable.

Finally, we deal with the case when $q$ exceeds $kA$. We write $k = 2^l k'$ where $k'$ is odd; then from quadratic reciprocity (A.7) (and (A.8), (A.9)) we have

$$\left(\frac{-ka}{q}\right) = c(q) \left(\frac{q}{k'a}\right),$$

where $c(q) := (-1)^{\frac{q-1}{2} + l\frac{q^2-1}{8} + \frac{(k'a-1)(q-1)}{4}}$ is, for fixed $a$, periodic in $q$ with period 8. We can thus rewrite this contribution to (7.11) as

$$\sum_{a \le A;\ (a,2)=1}\ \sum_{kA \le q \le B:\ (q,2ak)=1} c(q) \left(\frac{q}{k'a}\right) \frac{\log(B/q)}{q}.$$

For any fixed $a$ in the above sum, the expression

$$q \mapsto c(q) \left(\frac{q}{k'a}\right) 1_{(q,2ak)=1}$$

is periodic with period $8k'a = O(kA)$, is bounded in magnitude by 1 and has mean zero. A summation by parts then gives

$$\sum_{kA \le q \le B:\ (q,2ak)=1} c(q) \left(\frac{q}{k'a}\right) \frac{\log(B/q)}{q} \ll \log B,$$

and so on summing in $a$ we see that this contribution is acceptable. This concludes the proof of the proposition. □



We now record some variants of Proposition 1.4 that will also be useful in our applications.

Proposition 7.5 (Average value of $\tau_3(ab+1)$). For any $A, B > 1$, one has

$$\sum_{a \le A} \sum_{b \le B} \tau_3(ab+1) \ll AB \log^2(A+B). \tag{7.12}$$

Proof. By symmetry we may assume that $A \le B$, so that $ab \ll B^2$ for all $a \le A$ and $b \le B$. For any $n$, $\tau_3(n)$ is the number of ways to represent $n$ as the product $n = d_1 d_2 d_3$ of three terms. One of these terms must be at most $n^{1/3}$, and so

$$\tau_3(n) \ll \sum_{d \mid n:\ d \le n^{1/3}} \tau\Big(\frac{n}{d}\Big).$$

We can thus bound the left-hand side of (7.12) by

$$\ll \sum_{d \ll B^{2/3}} \sum_{a \le A} \sum_{b \le B:\ d \mid ab+1} \tau\Big(\frac{ab+1}{d}\Big).$$

Note that for fixed $a, d$, the constraint $d \mid ab+1$ is only possible if $a$ is coprime to $d$, and restricts $b$ to some primitive residue class $q \bmod d$ for some $q = q_{a,d}$ between 1 and $d$. Writing $b = cd + q$, we can thus bound the above expression by

$$\ll \sum_{d \ll B^{2/3}} \sum_{a \le A} \sum_{c \le B/d} \tau(ac + r),$$

where $r = r_{a,d} := (aq+1)/d$. Note that $r$ is clearly coprime to $a$. Thus by Corollary 7.4, we may bound the preceding expression by

$$\ll \sum_{d \ll B^{2/3}} \sum_{a \le A} \frac{B}{d} \log B,$$

which is $O(AB \log^2 B)$. The claim follows. □
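The key combinatorial step, that $\tau_3(n)$ is controlled by divisors $d \le n^{1/3}$, can be made explicit with the constant 3 (one factor for each position the small divisor may occupy). The sketch below checks this by brute force; the explicit constant and the function names are assumptions of this illustration rather than statements from the paper.

```python
def tau(n: int) -> int:
    """Number of positive divisors of n."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def tau3(n: int) -> int:
    """Number of ordered triples (d1, d2, d3) of positive integers with d1*d2*d3 == n."""
    return sum(tau(n // d) for d in range(1, n + 1) if n % d == 0)


def small_divisor_bound(n: int) -> int:
    """3 * sum of tau(n/d) over divisors d of n with d^3 <= n."""
    return 3 * sum(tau(n // d) for d in range(1, n + 1)
                   if n % d == 0 and d ** 3 <= n)
```

The comparison `d ** 3 <= n` avoids floating-point cube roots, so the test "d is at most $n^{1/3}$" is exact.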

Proposition 7.6 (Average value of $\tau(ab+cd)$). For any $A, B, C, D > 1$, one has

$$\sum_{\substack{a \le A,\ b \le B,\ c \le C,\ d \le D: \\ (a,b,c,d)=1}} \tau(ab+cd) \ll ABCD \log(A+B+C+D). \tag{7.13}$$

Proof. By symmetry we may assume that $A, B, C \le D$. Then for fixed $a, b, c$ coprime, we have

$$\sum_{d \le D} \tau(ab+cd) \ll D \log D$$

by Corollary 7.4, and the claim follows by summing in $a, b, c$. □



Remark 7.7. Informally, one can view the above propositions as asserting that the heuristics $\tau(n) \ll \log n$, $\tau_3(n) \ll \log^2 n$ are valid on average (in a first moment sense) on the range of various polynomial forms in several variables. A result similar to Proposition 7.6 was established in [24, Lemma 3], but with the coprimality condition $(a,b,c,d) = 1$ replaced by $(ab, cd) = 1$, and also the divisor function $\tau$ being restricted by forcing one of the divisors to live in a given dyadic range, with the logarithm being removed as a consequence. Also, products of three factors were permitted instead of the terms $ab$, $cd$. As remarked after [24, Lemma 4], the logarithmic term in (7.13) is necessary.

8. Upper bound for $\sum_{n \le N} f_{\mathrm{I}}(n)$ and $\sum_{p \le N} f_{\mathrm{I}}(p)$

Now that we have established Proposition 1.4 we can obtain upper bounds on sums of $f_{\mathrm{I}}$. We begin with the bound

$$\sum_{n \le N} f_{\mathrm{I}}(n) \ll N \log^3 N.$$

By Proposition 2.2 and symmetry followed by Lemma 2.8, it suffices to show that there are at most $O(N \log^3 N)$ septuples $(a,b,c,d,e,f,n) \in \mathbb{N}^7$ obeying (2.1)-(2.9) and the Type I estimates from Lemma 2.8. In particular, $acd \ll N$, $f$ is a factor of $4a^2d+1$, and $n = 4acd - f$. As $a, c, d, f$ determine the remaining components of the septuple, we may thus bound the number of such septuples as

$$\sum_{a,c,d:\ acd \ll N} \tau(4a^2d+1).$$

Dividing $a, c, d$ into dyadic blocks ($A/2 \le a \le A$, etc.) and applying Proposition 1.4 (with $k = 4$) to each block, we obtain the desired bound $O(N \log^3 N)$.
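The quantity being bounded here is a weighted count of triples $(a,c,d)$. A direct evaluation for small $N$ can be sketched as follows, taking the implied constant in $acd \ll N$ to be 1 for concreteness (an assumption of this sketch); the function names are illustrative.

```python
def tau(n: int) -> int:
    """Number of positive divisors of n, by trial division up to sqrt(n)."""
    count, d = 0, 1
    while d * d <= n:
        if n % d == 0:
            count += 1 if d * d == n else 2
        d += 1
    return count


def type_one_count(N: int) -> int:
    """Sum of tau(4*a^2*d + 1) over all triples (a, c, d) with a*c*d <= N."""
    total = 0
    for a in range(1, N + 1):
        for d in range(1, N // a + 1):
            c_choices = N // (a * d)  # number of c with a*c*d <= N
            total += c_choices * tau(4 * a * a * d + 1)
    return total
```

Since the summand does not depend on $c$, the inner sum over $c$ is replaced by a count; the result can be compared numerically with $N \log^3 N$ for moderate $N$.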
Now we establish the bound

$$\sum_{p \le N} f_{\mathrm{I}}(p) \ll N \log^2 N \log\log N.$$

As before, it suffices to count quadruples $(a, c, d, f)$ with $acd \ll N$, and $f$ a factor of $4a^2d+1$; but now we can restrict $p = 4acd - f$ to be prime. Also, from Proposition 2.2 we may assume that $p$ is coprime to $acd$ (and hence to $4acd$, if we discard the prime $p = 2$).

Thus we may assume without loss of generality that $-f \bmod 4ad$ is a primitive residue class. From the Brun–Titchmarsh inequality (A.10), we conclude that for each fixed $a, d, f$, there are $O(N/(\phi(4ad) \log(N/4ad)))$ primes $p$ in this residue class that are less than $N$ if $ad \le N/100$ (say); if instead $ad > N/100$, then we of course only have $O(1) = O(N/\phi(4ad))$ primes in this class. Thus, in any event, we can bound the number of such primes as $O(N/(\phi(4ad) \log(2 + N/ad)))$. We therefore have the bound

$$\sum_{p \le N} f_{\mathrm{I}}(p) \ll N \sum_{a,d:\ ad \ll N} \frac{\tau(4a^2d+1)}{\phi(4ad) \log(2 + N/ad)}. \tag{8.1}$$

By dyadic decomposition (and bounding $\phi(4ad) \ge \phi(ad)$), it thus suffices to show that

$$\sum_{a,d:\ N/2 \le ad \le N} \frac{\tau(4a^2d+1)}{\phi(ad)} \ll \log^2 N. \tag{8.2}$$

Indeed, assuming this bound for all $N$, we can bound the right-hand side of (8.1) by

$$\ll \sum_{i=1}^{O(\log N)} \frac{N \log^2 N}{i} \ll N \log^2 N \log\log N,$$

and the claim follows.

To prove (8.2), we would like to again apply Proposition 1.4, but we must first deal with the $\phi(ad)$ denominator. From (A.12) one has

$$\frac{1}{\phi(ad)} \ll \frac{1}{ad} \sum_{s \mid a} \sum_{t \mid d} \frac{1}{st}.$$

Writing $a = sa'$, $d = td'$, we may thus bound the left-hand side of (8.2) by

$$\ll \frac{1}{N} \sum_{s,t:\ st \le N} \frac{1}{st} \sum_{a',d':\ a'd' \le N/st} \tau(4s^2t(a')^2d' + 1).$$

Applying Proposition 1.4 to the inner sum (decomposed into dyadic blocks, and setting $k := 4s^2t$), we see that

$$\sum_{a',d':\ a'd' \le N/st} \tau(4s^2t(a')^2d' + 1) \ll \frac{N}{st} \log^2 N \log(1+st).$$

Inserting this bound and summing in $s, t$ we obtain the claim.
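The step that removes the $\phi$ denominator rests on a bound of the shape $n/\phi(n) \ll \sum_{d \mid n} 1/d$. The exact statement of (A.12) is not reproduced in this section, so the sketch below checks one explicit version of such a bound, with the constant $\zeta(2) = \pi^2/6$; both the constant and the function names are assumptions of this illustration.

```python
from math import gcd, pi


def phi(n: int) -> int:
    """Euler's totient function, by direct counting."""
    return sum(1 for r in range(1, n + 1) if gcd(r, n) == 1)


def inverse_divisor_sum(n: int) -> float:
    """Sum of 1/d over the positive divisors d of n."""
    return sum(1.0 / d for d in range(1, n + 1) if n % d == 0)


def totient_bound_holds(n: int) -> bool:
    """Check n/phi(n) <= zeta(2) * sum_{d | n} 1/d."""
    return n / phi(n) <= (pi ** 2 / 6) * inverse_divisor_sum(n)
```

The explicit constant comes from writing $n/\phi(n) = \prod_{p \mid n} (1 + 1/p) \prod_{p \mid n} (1 - p^{-2})^{-1}$ and bounding the second product by $\zeta(2)$.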



9. Upper bound for $\sum_{n \le N} f_{\mathrm{II}}(n)$ and $\sum_{p \le N} f_{\mathrm{II}}(p)$

Now we prove the upper bound

$$\sum_{n \le N} f_{\mathrm{II}}(n) \ll N \log^3 N.$$

By Proposition 2.6 followed by Lemma 2.8 (and symmetry), it suffices to show that there are at most $O(N \log^3 N)$ $\mathbb{N}$-points $(a, b, c, d, e, f)$ that lie in $\Sigma^{\mathrm{II}}_n$ for some $n \le N$, and which also obey the Type II bound $acde \le N$ in Lemma 2.8.

Observe from (2.13)-(2.21) that $a, c, d, e$ determine the other variables $b, f, n$. Thus, it suffices to show that there are $O(N \log^3 N)$ quadruples $(a, c, d, e) \in \mathbb{N}^4$ with $acde \le N$. But this follows from (A.2) with $k = 4$.
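The count of quadruples with $acde \le N$ is the same as $\sum_{n \le N} \tau_4(n)$, which is what (A.2) estimates. A short recursive sketch of this count (the function name `count_tuples` is illustrative) is:

```python
def count_tuples(N: int, k: int) -> int:
    """Number of k-tuples of positive integers whose product is at most N."""
    if k == 1:
        return N
    # Fix the first coordinate d, then count (k-1)-tuples with product <= N // d.
    return sum(count_tuples(N // d, k - 1) for d in range(1, N + 1))
```

For `k = 2` this reproduces the divisor summatory function, and for `k = 4` it gives the quadruple count used above, which can be compared numerically with $N \log^3 N$.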



Finally, we prove the upper bound

$$\sum_{p \le N} f_{\mathrm{II}}(p) \ll N \log^2 N.$$

By dyadic decomposition, it suffices to show that

$$\sum_{N/2 \le p \le N} f_{\mathrm{II}}(p) \ll N \log^2 N. \tag{9.1}$$

As before, we can bound the left-hand side (up to constants) by the number of quadruples $(a, c, d, e) \in \mathbb{N}^4$ with $acde \ll N$. However, by (2.16), we may also add the restriction that $4acde - 4a^2d - e$ is a prime between $N/2$ and $N$. Also, if we set $b := ce - a$, then by Lemma 2.8 we may also add the restrictions $a \le b$ and $b < ce$, and from Proposition 2.6 we can also require that $a, b$ be coprime. Since

$$(ade)(acd)(ab)^{1/2} \le (ade)(acd)b \le (ade)(acd)(ce) = (acde)^2 \ll N^2,$$

we see that one of the quantities $ade$, $acd$, $ab$ must be at most $O(N^{4/5})$ (cf. Section 3). As we shall soon see, the ability to take one of these quantities to be significantly less than $N$ allows us to avoid the inefficiencies in the Brun–Titchmarsh inequality (A.10) that led to a double logarithmic loss in the Type I case. (Unfortunately, it does not seem that a similar trick is available in the Type I case.)
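The pigeonhole step behind the exponent $4/5$ can be spelled out as follows. This is a sketch of the implicit argument; the symbol $C$ (a sufficiently large absolute constant) is introduced here purely for illustration.

```latex
\text{Suppose } ade,\ acd,\ ab > C N^{4/5}. \text{ Then }
(ade)(acd)(ab)^{1/2} > C^{5/2} N^{\frac45 + \frac45 + \frac25} = C^{5/2} N^{2},
\text{ which contradicts } (ade)(acd)(ab)^{1/2} \ll N^{2} \text{ once } C \text{ is large enough.}
```

The exponents add up exactly because $ab$ enters only through its square root, which is where the restriction $a \le b$ is used.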

Let us first consider those quadruples with $ade \ll N^{4/5}$, which is the easiest case. For fixed $a, d, e$, the quantity $4acde - 4a^2d - e$ traverses a (possibly non-primitive) residue class modulo $4ade$. As $ade \ll N^{4/5}$, there are no primes in this class that are at least $N/2$ if the class is not primitive. If it is primitive, we may apply the Brun–Titchmarsh inequality (A.10) to bound the number of primes between $N/2$ and $N$ in this class by $O(N/(\phi(4ade) \log N))$, noting that $\log(N/4ade)$ is comparable to $\log N$. Thus, we can bound this contribution to the left-hand side of (9.1) by

$$\ll \frac{N}{\log N} \sum_{a,d,e:\ ade \ll N^{4/5}} \frac{1}{\phi(4ade)};$$

setting $m := ade$ and bounding $\phi(4ade) \ge \phi(ade)$, we can bound this in turn by

$$\ll \frac{N}{\log N} \sum_{m \ll N^{4/5}} \frac{\tau_3(m)}{\phi(m)},$$

where $\tau_3(m) := \sum_{a,d,e:\ ade = m} 1$. Applying Lemma A.1 we have

$$\sum_{m \ll N^{4/5}} \frac{\tau_3(m)}{\phi(m)} \ll \log^3 N, \tag{9.2}$$

and so this contribution is acceptable.

Now we consider the case $acd \ll N^{4/5}$. Here, we rewrite $4acde - 4a^2d - e$ as $(4acd - 1)e - 4a^2d$, which then traverses a (possibly non-primitive) residue class modulo $4acd - 1$. Applying the Brun–Titchmarsh inequality as before, we may bound this contribution by

$$\ll \frac{N}{\log N} \sum_{a,c,d:\ acd \ll N^{4/5}} \frac{1}{\phi(4acd - 1)}$$

and hence (setting $m := 4acd - 1$) by

$$\ll \frac{N}{\log N} \sum_{m \ll N^{4/5}} \frac{\tau_3(m+1)}{\phi(m)},$$

so that it suffices to establish the bound

$$\sum_{m \ll N^{4/5}} \frac{\tau_3(m+1)}{\phi(m)} \ll \log^3 N. \tag{9.3}$$

This is superficially similar to (9.2), but this time the summand is not multiplicative in $m$, and we can no longer directly apply Lemma A.1. To deal with this, we apply (A.12) and bound the left-hand side of (9.3) by

$$\ll \sum_{m \ll N^{4/5}} \sum_{d \mid m} \frac{\tau_3(m+1)}{dm};$$

writing $m = dn$, we can rearrange this as

$$\ll \sum_{d \ll N^{4/5}} \frac{1}{d^2} \sum_{n \ll N^{4/5}/d} \frac{\tau_3(dn+1)}{n}.$$

Applying dyadic decomposition of the $d, n$ variables and using Proposition 7.5 we obtain (9.3) as required.

Finally, we consider the case $ab \ll N^{4/5}$. Here, we rewrite $4acde - 4a^2d - e$ as $4abd - e$, and note that $e$ divides $a + b = ce$. If we fix $a, b$, there are thus at most $\tau(a+b)$ choices for $e$ (which also fixes $c$), and once one fixes such a choice, $4abd - e$ traverses a (possibly non-primitive) residue class modulo $4ab$. Applying the Brun–Titchmarsh inequality again, we may bound this contribution by

$$\ll \frac{N}{\log N} \sum_{a,b:\ ab \ll N^{4/5};\ (a,b)=1} \frac{\tau(a+b)}{\phi(4ab)}.$$

Bounding $\phi(4ab) \ge \phi(ab)$ and using (A.12), we can bound this by

$$\ll \frac{N}{\log N} \sum_{a,b:\ ab \ll N^{4/5};\ (a,b)=1}\ \sum_{k \mid a} \sum_{l \mid b} \frac{\tau(a+b)}{abkl}.$$

Writing $a = km$, $b = ln$, we may bound this by

$$\ll \frac{N}{\log N} \sum_{\substack{k,l,m,n:\ klmn \ll N^{4/5}; \\ (k,l,m,n)=1}} \frac{\tau(km + ln)}{k^2 l^2 mn}.$$

Dyadically decomposing in $k, l, m, n$ and using Proposition 7.6, we see that this contribution is also $O(N \log^2 N)$. The proof of (9.1) (and thus Theorem 1.1) is now complete.
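The three cases above rely on rewriting the same integer $4acde - 4a^2d - e$ in forms adapted to different moduli. The identities can be confirmed mechanically; the function names below are illustrative.

```python
def prime_candidate(a: int, c: int, d: int, e: int) -> int:
    """The quantity 4acde - 4a^2 d - e that is required to be prime."""
    return 4 * a * c * d * e - 4 * a * a * d - e


def via_modulus_4acd_minus_1(a: int, c: int, d: int, e: int) -> int:
    """Same quantity written as (4acd - 1) e - 4a^2 d, a residue class mod 4acd - 1."""
    return (4 * a * c * d - 1) * e - 4 * a * a * d


def via_modulus_4ab(a: int, c: int, d: int, e: int) -> int:
    """Same quantity written as 4abd - e with b := ce - a, a residue class mod 4ab."""
    b = c * e - a
    return 4 * a * b * d - e
```

In each form one variable is left free ($c$, $e$ and $d$ respectively), which is what allows the Brun–Titchmarsh inequality to be applied with a modulus of size $\ll N^{4/5}$.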



10. Solutions by polynomials

We now prove Proposition 1.9. We first verify that each of the sets is solvable by polynomials (which of course implies that any residue class contained in such a class is also solvable by polynomials). We first do this for the Type I sets. In view of the $\pi^{\mathrm{I}}_n$ map (which clearly preserves polynomiality), it will suffice to find polynomials $a = a(n), \ldots, f = f(n)$ of $n$ that take values in $\mathbb{N}$ for sufficiently large $n$ in these sets, and such that $(a(n), \ldots, f(n)) \in \Sigma^{\mathrm{I}}_n$ for all $n$. This is achieved as follows:

• If $n \equiv -f \bmod 4ad$, where $a, d, f \in \mathbb{N}$ are such that $f \mid 4a^2d + 1$, then we take

$$(a, b, c, d, e, f) := \Big(a,\ \frac{n+f}{4ad}\, e - a,\ \frac{n+f}{4ad},\ d,\ e,\ f\Big), \quad \text{where } e := \frac{4a^2d+1}{f}.$$

• If $n \equiv -f \bmod 4ac$ and $n \equiv -c/a \bmod f$, where $a, c, f \in \mathbb{N}$ are such that $(4ac, f) = 1$, then we take

$$(a, b, c, d, e, f) := \Big(a,\ \frac{na + c}{f},\ c,\ \frac{n+f}{4ac},\ \frac{na + af + c}{fc},\ f\Big);$$

note from the hypotheses that $na + af + c$ is divisible by the coprime moduli $f$ and $c$, and is thus also divisible by $fc$.

• If $n \equiv -f \bmod 4cd$ and $n^2 \equiv -4c^2d \bmod f$, where $c, d, f \in \mathbb{N}$ are such that $(4cd, f) = 1$, then we take

$$(a, b, c, d, e, f) := \Big(\frac{n+f}{4cd},\ \frac{n^2 + 4c^2d + nf}{4cdf},\ c,\ d,\ \frac{(n+f)^2 + 4c^2d}{4c^2df},\ f\Big);$$

note from the hypotheses that $(n+f)^2 + 4c^2d$ is divisible by the coprime moduli $4c^2d$ and $f$, and is thus also divisible by $4c^2df$.

• If $n \equiv -1/e \bmod 4ab$, where $a, b, e \in \mathbb{N}$ are such that $e \mid a + b$ and $(e, 4ab) = 1$, then we take

$$(a, b, c, d, e, f) := \Big(a,\ b,\ \frac{a+b}{e},\ \frac{ne+1}{4ab},\ e,\ 4a\, \frac{a+b}{e}\, \frac{ne+1}{4ab} - n\Big).$$

One easily verifies in each of these cases that one has an $\mathbb{N}$-point of $\Sigma^{\mathrm{I}}_n$ for $n$ large enough.

Now we turn to the Type II case. We use the same arguments as before, but using $\Sigma^{\mathrm{II}}_n$ in place of $\Sigma^{\mathrm{I}}_n$ of course:

• If $n \equiv -e \bmod 4ab$, where $a, b, e \in \mathbb{N}$ are such that $e \mid a + b$ and $(e, 4ab) = 1$, then we take

$$(a, b, c, d, e, f) := \Big(a,\ b,\ \frac{a+b}{e},\ \frac{n+e}{4ab},\ e,\ \frac{a+b}{e}\, \frac{n+e}{b} - 1\Big).$$

• If $n \equiv -4a^2d \bmod f$, where $a, d, f \in \mathbb{N}$ are such that $4ad \mid f + 1$, then we take

$$(a, b, c, d, e, f) := \Big(a,\ \frac{f+1}{4ad}\, \frac{n + 4a^2d}{f} - a,\ \frac{f+1}{4ad},\ d,\ \frac{n + 4a^2d}{f},\ f\Big).$$

• If $n \equiv -4a^2d - e \bmod 4ade$, where $a, d, e \in \mathbb{N}$ are such that $(4ad, e) = 1$, then we take

$$(a, b, c, d, e, f) := \Big(a,\ \frac{n+e}{4ad},\ \frac{n + 4a^2d + e}{4ade},\ d,\ e,\ \frac{n + 4a^2d}{e}\Big).$$

Again, one easily verifies in each of these cases that one has an $\mathbb{N}$-point of $\Sigma^{\mathrm{II}}_n$ for $n$ large enough.
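A few of these parametric families can be checked with exact rational arithmetic. The sketch below assumes the shapes $(x,y,z) = (abdn, acd, bcd)$ for Type I and $(x,y,z) = (abd, acdn, bcdn)$ for Type II solutions, as used elsewhere in the paper; the function names are illustrative and only three of the families are implemented.

```python
from fractions import Fraction


def type1_from_adf(n, a, d, f):
    """Type I family: n = -f (mod 4ad) with f | 4a^2 d + 1. Returns (x, y, z)."""
    assert (4 * a * a * d + 1) % f == 0 and (n + f) % (4 * a * d) == 0
    c = (n + f) // (4 * a * d)
    e = (4 * a * a * d + 1) // f
    b = c * e - a
    return (a * b * d * n, a * c * d, b * c * d)


def type2_from_abe(n, a, b, e):
    """Type II family: n = -e (mod 4ab) with e | a + b. Returns (x, y, z)."""
    assert (a + b) % e == 0 and (n + e) % (4 * a * b) == 0
    c = (a + b) // e
    d = (n + e) // (4 * a * b)
    return (a * b * d, a * c * d * n, b * c * d * n)


def type2_from_adf(n, a, d, f):
    """Type II family: n = -4a^2 d (mod f) with 4ad | f + 1. Returns (x, y, z)."""
    assert (f + 1) % (4 * a * d) == 0 and (n + 4 * a * a * d) % f == 0
    c = (f + 1) // (4 * a * d)
    e = (n + 4 * a * a * d) // f
    b = c * e - a
    return (a * b * d, a * c * d * n, b * c * d * n)


def solves(n, xyz):
    """True if 4/n = 1/x + 1/y + 1/z exactly."""
    return sum(Fraction(1, t) for t in xyz) == Fraction(4, n)
```

For instance, `type1_from_adf(7, 1, 1, 5)` yields the decomposition $4/7 = 1/14 + 1/3 + 1/6$, and `type2_from_abe(13, 1, 2, 3)` yields $4/13 = 1/4 + 1/26 + 1/52$.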

Now we establish the converse claim. Suppose first that we have a primitive residue class $q \bmod r$ that can be Type I solved by polynomials, and is maximal with respect to this property. Then we have

$$\frac{4}{p} = \frac{1}{x} + \frac{1}{y} + \frac{1}{z}$$

for all sufficiently large primes $p$ in this class, where $x = x(p)$, $y = y(p)$, $z = z(p)$ are polynomials of $p$ that take natural number values for all large $p$ in this class. For all sufficiently large $p$, we either have $y(p) \ge z(p)$ for all $p$, or $y(p) \le z(p)$ for all $p$; by symmetry we may assume the latter.

Applying Proposition 2.2, we see that

$$(x, y, z) = (abdp, acd, bcd)$$

for some $\mathbb{N}$-point $(a, \ldots, f) = (a(p), \ldots, f(p))$ in $\Sigma^{\mathrm{I}}_p$ with $a(p), b(p), c(p)$ having no common factor. In particular, $d = d(p)$ is the greatest common divisor of $x(p), y(p), z(p)$. Applying the Euclidean algorithm to the polynomials $x(p), y(p), z(p)$, we conclude that for sufficiently large $p$ in the primitive residue class, $d$ is also a polynomial in $p$, which divides the polynomials $x, y, z$. Dividing out by $d$ and repeating these arguments, we conclude that $a = a(p)$, $b = b(p)$, and $c = c(p)$ are also polynomials in $p$ for sufficiently large $p$ in the primitive residue class. Applying the identities (2.1)-(2.9) we also see that $e = e(p)$ and $f = f(p)$ are polynomials in $p$ for sufficiently large $p$.

From Lemma 2.8 we have $a(p)c(p)d(p) = O(p)$ and $f(p) = O(p)$ for all $p$, which implies that at least two of the polynomials $a(p), c(p), d(p)$ must be constant in $p$, and that $f(p)$ has degree at most 1 in $p$. We now divide into several cases.

First suppose that $a, d$ are independent of $p$. By (2.7) this forces $e, f$ to be independent of $p$ as well, and $f$ divides $4a^2d + 1$. By (2.6) we have

$$p \equiv -f \bmod 4ad$$

for all sufficiently large primes $p \equiv q \bmod r$, and thus (by Dirichlet's theorem on primes in arithmetic progressions) the primitive residue class $q \bmod r$ is contained in the residue class $-f \bmod 4ad$, and the claim follows in this case.

Now suppose that $a, c$ are independent of $p$, and $f$ has degree 0 (i.e. is also independent of $p$). Then from (2.6) we have $p \equiv -f \bmod 4ac$, and from (2.8) we have $p \equiv -c/a \bmod f$; since $p$ is a large prime this also forces $(4ac, f) = 1$, and the claim follows.

Now suppose that $a, c$ are independent of $p$, and $f$ has degree 1 (and thus grows linearly in $p$). By Lemma 2.8, $b, e$ are then bounded and thus constant in $p$. From (2.2) we have $e \mid a + b$, and from (2.1) we have $p \equiv -1/e \bmod 4ab$. As $p$ is an arbitrarily large prime, this forces $(4ab, e) = 1$, and the claim follows.

Next, suppose that $c, d$ are independent of $p$, and $f$ has degree 0. Then from (2.6) one has $p \equiv -f \bmod 4cd$, which in particular forces $(4cd, f) = 1$. From (2.9) one has $p^2 \equiv -4c^2d \bmod f$, and the claim follows.

Finally, suppose that $c, d$ are independent of $p$, and $f$ has degree 1. By (2.9), $f(p)$ divides $p^2 + 4c^2d$ for all large primes $p$ in the primitive residue class. Applying the Euclidean algorithm, we conclude that $f$ in fact divides $p^2 + 4c^2d$ as a polynomial in $p$. But as $c, d$ are positive, $p^2 + 4c^2d$ is irreducible over the reals, a contradiction. This concludes the treatment of the Type I case.

We now turn to the Type II case. Let $q \bmod r$ be a residue class that is Type II solvable by polynomials. Arguing as in the Type I case, we obtain an $\mathbb{N}$-point $(a, \ldots, f) = (a(p), \ldots, f(p))$ in $\Sigma^{\mathrm{II}}_p$ for all sufficiently large primes $p$ in this class, obeying the bounds in Lemma 2.8, with $a(p), \ldots, f(p)$ all depending in a polynomial fashion on $p$.

From Lemma 2.8 we have $a(p)c(p)d(p)e(p) = O(p)$, and so three of the polynomials $a(p), c(p), d(p), e(p)$ must be independent of $p$.

Suppose first that $a, c, e$ are independent of $p$. By (2.2), $b$ is independent of $p$ also, and $e \mid a + b$. By (2.13), $p \equiv -e \bmod 4ab$, and thus $(e, 4ab) = 1$, and the claim then follows from Dirichlet's theorem.

Now suppose that $a, c, d$ are independent of $p$. By (2.18), $f$ is independent of $p$ also, and $4ad \mid f + 1$. From (2.19) one has $p \equiv -4a^2d \bmod f$, and the claim follows.

Next, suppose $a, d, e$ are independent of $p$. By (2.16) one has $p \equiv -4a^2d - e \bmod 4ade$, which implies $(4ad, e) = 1$, and the claim follows.

Finally, suppose $c, d, e$ are independent of $p$. By (2.14) this forces $a, b$ to be bounded, and hence also independent of $p$; and so this case is subsumed by the preceding cases.



11. Lower bounds III

11.1. Generation of solutions. We begin the proof of Theorem 1.11; the method of proof will be a generalisation of that in Section 5. For the rest of this section, $m$ and $k$ are fixed, and all implied constants in asymptotic notation are allowed to depend on $m, k$. We assume that $N$ is sufficiently large depending on $m, k$.

In the $m = 4$, $k = 3$ case, Type II solutions were generated by the ansatz

$$(t_1, t_2, t_3) = (abd, acdn, bcdn)$$

for various quadruples $(a, b, c, d)$ (or equivalently, quadruples $(a, c, d, e)$, setting $b := ce - a$); see (2.22). We will use a generalisation of this ansatz for higher $k$; for instance, when $k = 4$ we will construct solutions of the form

$$(t_1, t_2, t_3, t_4) = (b x_{12} x_{123} x_{124} x_{1234},\ x_{12} x_{23} x_{24} x_{123} x_{124} x_{234} x_{1234}\, n,\ b x_{23} x_{123} x_{234} x_{1234}\, n,\ b x_{24} x_{124} x_{234} x_{1234}\, n)$$

for various octuples $(b, x_{12}, x_{23}, x_{24}, x_{123}, x_{124}, x_{234}, x_{1234})$, or equivalently, using octuples

$$(x_{12}, x_{23}, x_{24}, x_{123}, x_{124}, x_{234}, x_{1234}, e),$$

and setting

$$b = e x_{23} x_{24} x_{234} - x_{12} x_{24} x_{124} - x_{12} x_{23} x_{123}.$$

More generally, we will generate Type II solutions via the following lemma.

Lemma 11.2 (Generation of Type II solutions). Let $\mathcal{P}$ denote the $(2^{k-1} - 1)$-element set

$$\mathcal{P} := \{ I \subset \{1, \ldots, k\} : 2 \in I;\ I \neq \{2\} \}.$$

Let $(x_I)_{I \in \mathcal{P}}$ be a tuple of natural numbers, and let $e$ be another natural number, obeying the inequalities

$$\frac{1}{2m} N \le e \prod_{I \in \mathcal{P}} x_I \le \frac{1}{m} N \tag{11.1}$$

and

$$1 \le x_I \le N^{1/2^{k+2}} \tag{11.2}$$

whenever $I \in \mathcal{P}$. Suppose also that the quantity

$$w := \prod_{I \in \mathcal{P}:\ I \neq \{1,2\}} x_I \tag{11.3}$$

is square-free. Set

$$b := e \prod_{I \in \mathcal{P}:\ 1 \notin I} x_I - \sum_{j=3}^{k} \prod_{I \in \mathcal{P}:\ j \notin I} x_I \tag{11.4}$$

and

$$t_1 := b \prod_{I \in \mathcal{P}:\ 1 \in I} x_I \tag{11.5}$$

$$n := m t_1 - e \tag{11.6}$$

$$t_2 := n \prod_{I \in \mathcal{P}} x_I \tag{11.7}$$

$$t_j := bn \prod_{I \in \mathcal{P}:\ j \in I} x_I \quad \text{for } 3 \le j \le k. \tag{11.8}$$

Then $n$ is a natural number with $n \le N$, and $(t_1, \ldots, t_k)$ is a Type II solution for this value of $n$. Furthermore, each choice of $(x_I)_{I \in \mathcal{P}}$ and $e$ generates a distinct Type II solution.
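The construction in Lemma 11.2 is easy to carry out mechanically. The following sketch builds $(n, t_1, \ldots, t_k)$ from the parameters following (11.4)-(11.8) and leaves the size and square-free hypotheses (11.1)-(11.3) to the caller; the identity $m/n = \sum_j 1/t_j$ itself only needs $b$ and $n$ to be positive. The function names and the dictionary representation of $(x_I)$ are choices made for this illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def index_sets(k):
    """The family P: subsets of {1, ..., k} that contain 2 and are not {2} itself."""
    others = [i for i in range(1, k + 1) if i != 2]
    return [frozenset((2,) + extra)
            for r in range(1, len(others) + 1)
            for extra in combinations(others, r)]


def generate(m, k, x, e):
    """Return (n, [t_1, ..., t_k]) built from x (dict: frozenset -> int) and e."""
    P = index_sets(k)
    b = (e * prod(x[I] for I in P if 1 not in I)
         - sum(prod(x[I] for I in P if j not in I) for j in range(3, k + 1)))
    t1 = b * prod(x[I] for I in P if 1 in I)
    n = m * t1 - e
    t2 = n * prod(x[I] for I in P)
    rest = [b * n * prod(x[I] for I in P if j in I) for j in range(3, k + 1)]
    return n, [t1, t2] + rest


def is_solution(m, n, ts):
    """True if m/n equals the sum of the unit fractions 1/t_j exactly."""
    return sum(Fraction(1, t) for t in ts) == Fraction(m, n)
```

With $m = 4$, $k = 3$, all $x_I = 1$ and $e = 3$ this returns $n = 5$ and $(t_1, t_2, t_3) = (2, 5, 10)$, i.e. $4/5 = 1/2 + 1/5 + 1/10$.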

Remark 11.3. In the $m = 4$, $k = 3$ case, the parameters $x_I$ are related to the coordinates $(a, b, c, d, e, f)$ appearing in Proposition 2.6 by the formula

$$(a, b, c, d, e, f) = (x_{12}, b, x_{23}, x_{123}, e, 4x_{12}x_{23}x_{123} - 1);$$

however, the constraint that $a, b, c$ have no common factor and $abd$ is coprime to $n$ has been replaced by the slightly different criterion that $cd$ is squarefree, which turns out to be more convenient for obtaining lower bounds (note that the same trick was also used to prove (5.1)).



Parameterisations of this type have appeared numerous times in the previous literature (see 



23, \2^ [53 [7^ . or indeed Propositions 2.2, 2.6), though because most of these parameter- 



isations were focused on dealing with all solutions of a given type, as opposed to an easily 
countable subset of solutions, there were more parameters xi (indexed by all non-empty sub- 
sets 0/ {1, . . . , A;}, not just the ones in V), and there were some coprimality conditions on the 
xi rather than square-free conditions. 



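Proof. Let the notation be as in the lemma. Then from (11.2) one has

$$\sum_{j=3}^{k} \prod_{I \in \mathcal{P}:\ j \notin I} x_I \le (k-2) N^{2^{k-2}/2^{k+2}} \ll N^{1/16},$$

while since

$$\prod_{I \in \mathcal{P}} x_I \ll N^{2^{k-1}/2^{k+2}} = N^{1/8}$$

we see from (11.1) that

$$e \gg N^{7/8}.$$

From (11.4) we then have that

$$\frac{1}{2} e \prod_{I \in \mathcal{P}:\ 1 \notin I} x_I \le b \le e \prod_{I \in \mathcal{P}:\ 1 \notin I} x_I$$

and thus by (11.5)

$$\frac{1}{2} e \prod_{I \in \mathcal{P}} x_I \le t_1 \le e \prod_{I \in \mathcal{P}} x_I$$

and thus by (11.6) (noting that $m \ge 4$)

$$\frac{1}{4} m e \prod_{I \in \mathcal{P}} x_I \le n \le m e \prod_{I \in \mathcal{P}} x_I.$$

These bounds ensure that $b, n, t_1, \ldots, t_k$ are natural numbers with $n \le N$, and with $t_2, \ldots, t_k$ divisible by $n$. Dividing (11.4) by $bn \prod_{I \in \mathcal{P}} x_I$ and using (11.5), (11.7), (11.8), we conclude that

$$\frac{1}{t_2} + \sum_{j=3}^{k} \frac{1}{t_j} = \frac{e}{n t_1};$$

applying (11.6) one concludes that $(t_1, \ldots, t_k)$ is a Type II solution.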

It remains to demonstrate that each choice of $(x_I)_{I \in \mathcal{P}}$ and $e$ generates a distinct Type II solution, or equivalently that the Type II solution $(t_1, \ldots, t_k)$ uniquely determines $(x_I)_{I \in \mathcal{P}}$ and $e$. To do this, first observe from (1.6) that $(t_1, \ldots, t_k)$ determines $n$, and from (11.6) we see that $e$ is determined also. Next, observe from (11.5), (11.7), (11.8) that for any $3 \le j \le k$, one has

$$\frac{t_2 t_j}{n^2 t_1} = \Big( \prod_{I \in \mathcal{P}:\ j \in I;\ 1 \notin I} x_I \Big)^2 \prod_{I \in \mathcal{P}:\ (j \in I) \text{ XOR } (1 \notin I)} x_I, \tag{11.9}$$

where XOR denotes the exclusive or operator; in particular, the left-hand side is necessarily a natural number. Note that all the factors $x_I$ appearing on the right-hand side are components of the square-free quantity $w$ given by (11.3). We conclude that $\big( \prod_{I \in \mathcal{P}:\ j \in I;\ 1 \notin I} x_I \big)^2$ is the largest perfect square dividing $\frac{t_2 t_j}{n^2 t_1}$. We conclude that the Type II solution $(t_1, \ldots, t_k)$ determines all the products

$$\prod_{I \in \mathcal{P}:\ j \in I;\ 1 \notin I} x_I \tag{11.10}$$

for $3 \le j \le k$. Note (from the square-free nature of $w$) that the $x_I$ with $1 \notin I$ are all coprime. Taking the greatest common divisor of the (11.10) for all $3 \le j \le k$, we see that the Type II solution determines $x_{\{2,3,\ldots,k\}}$. Dividing this quantity out from all the expressions (11.10), and then taking the greatest common divisor of the resulting quotients for $4 \le j \le k$, one recovers $x_{\{2,4,\ldots,k\}}$; a similar argument gives $x_I$ for any $I \in \mathcal{P}$ with $1 \notin I$ of cardinality $k - 2$. Dividing out these quantities and taking greatest common divisors again, one can then recover $x_I$ for any $I \in \mathcal{P}$ with $1 \notin I$ of cardinality $k - 3$; continuing in this fashion we can recover all the $x_I$ with $I \in \mathcal{P}$ and $1 \notin I$.

Returning to (11.9), we can then recover the products $\prod_{I \in \mathcal{P}:\ 1, j \in I} x_I$ for all $3 \le j \le k$. Taking greatest common divisors iteratively as before, we can then recover all the $x_I$ with $I \in \mathcal{P}$ and $1 \in I$, thus reconstructing all of the data $(x_I)_{I \in \mathcal{P}}$ and $e$, as claimed. □

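In view of the above lemma, we see that to prove (1.7), it suffices to show that the number of tuples $((x_I)_{I \in \mathcal{P}}, e)$ obeying the hypotheses of the lemma is at least $cN(\log N)^{2^{k-1}-1}$ for an absolute constant $c > 0$.

Observe that if we fix $x_I$ with $I \in \mathcal{P}$ obeying (11.2) and with the quantity $w$ defined by (11.3) square-free, then there are

$$\gg \frac{N}{\prod_{I \in \mathcal{P}} x_I}$$

choices of $e$ that obey (11.1). Thus, noting that $\mu^2(w) \ge \mu^2(\prod_{I \in \mathcal{P}} x_I)$, the number of tuples obeying the hypotheses of the lemma is

$$\gg N \sum\nolimits^{*} \frac{\mu^2(\prod_{I \in \mathcal{P}} x_I)}{\prod_{I \in \mathcal{P}} x_I}, \tag{11.11}$$

where the sum $\sum^{*}$ ranges over all choices of $(x_I)_{I \in \mathcal{P}}$ obeying the bounds (11.2). To estimate (11.11), we make use of [14, Theorem 6.4], which we restate as a lemma:

Lemma 11.4. Let $l \ge 1$, and for each $1 \le i \le l$, let $\alpha_i < \beta_i$ be positive real numbers. Then

$$\sum_{N^{\alpha_i} \le n_i \le N^{\beta_i} \text{ for all } 1 \le i \le l} \frac{\mu^2(n_1 \cdots n_l)}{n_1 \cdots n_l} \gg_l (\log N)^l \prod_{i=1}^{l} (\beta_i - \alpha_i) \tag{11.12}$$

for $N$ sufficiently large depending on $l$ and the $\alpha_1, \ldots, \alpha_l, \beta_1, \ldots, \beta_l$.

From this lemma (and noting that there are $2^{k-1} - 1$ parameters $x_I$ in the sum $\sum^{*}$) we see that

$$\sum\nolimits^{*} \frac{\mu^2(\prod_{I \in \mathcal{P}} x_I)}{\prod_{I \in \mathcal{P}} x_I} \gg (\log N)^{2^{k-1}-1}; \tag{11.13}$$

inserting this into (11.11) we obtain the claim.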



Now we prove (1.8). As in Section 5, the arguments are similar to those used to prove (1.7), but with the additional input of the Bombieri–Vinogradov inequality.

As in the proof of (1.7), it suffices to obtain a lower bound (in this case, $cN(\log N)^{2^{k-1}-2}/\log\log N$ for some $c > 0$) on the number of tuples $((x_I)_{I \in \mathcal{P}}, e)$, but now with the additional constraint that the quantity

$$p := m t_1 - e = mb \prod_{I \in \mathcal{P}:\ 1 \in I} x_I - e$$

is prime.

Suppose we fix $(x_I)_{I \in \mathcal{P}}$ obeying (11.2) with $w$ squarefree. We may write

$$p = qe + r$$

where

$$q := m \prod_{I \in \mathcal{P}} x_I - 1 \tag{11.14}$$

and

$$r := -m \prod_{I \in \mathcal{P}:\ 1 \in I} x_I \sum_{j=3}^{k} \prod_{I \in \mathcal{P}:\ j \notin I} x_I.$$

Thus as $e$ varies in the range given by (11.1), $qe + r$ traces out an arithmetic progression of spacing $q$ whose convex hull contains $[0.6N, 0.9N]$ (say). Thus, every prime $p$ in this interval $[0.6N, 0.9N]$ that is congruent to $r \bmod q$ will provide an $e$ that will give a Type II solution with $n = p$ prime, and different choices of $(x_I)_{I \in \mathcal{P}}$ and $p$ will give different Type II solutions. For fixed $(x_I)_{I \in \mathcal{P}}$, if $r$ is coprime to $q$, then we see from (A.13) (and estimating $\mathrm{li}(x) = (1 + o(1))x/\log x$) that the number of such $p$ is at least

$$c \frac{N}{\phi(q) \log N} - D(0.6N; q) - D(0.9N; q)$$

for some absolute constant $c > 0$. It thus suffices to show that

$$\sum\nolimits^{*} \mu^2(w)\, 1_{(r,q)=1} \frac{N}{\phi(q) \log N} \gg \frac{N (\log N)^{2^{k-1}-2}}{\log\log N} \tag{11.15}$$

and

$$\sum\nolimits^{*} D(cN; q) = o\Big( \frac{N (\log N)^{2^{k-1}-2}}{\log\log N} \Big) \tag{11.16}$$

for $c = 0.6, 0.9$.

We first show (11.15). Since $\mathrm{li}(N/100)$ is comparable to $N/\log N$, and $\phi(q) \le q \ll \prod_{I \in \mathcal{P}} x_I$, we may simplify (11.15) as

$$\sum\nolimits^{*} \frac{\mu^2(w)}{\prod_{I \in \mathcal{P}} x_I}\, 1_{(r,q)=1} \gg \frac{(\log N)^{2^{k-1}-1}}{\log\log N}. \tag{11.17}$$

The expression on the left-hand side is similar to (11.11), but now one also has the additional constraint $1_{(r,q)=1}$. To deal with this constraint, we restrict the ranges of the $x_I$ parameters somewhat to perform an averaging in the $x_{\{1,2\}}$ parameter (taking advantage of the fact that this parameter does not appear in the $\mu^2(w)$ term). More precisely, we restrict to the ranges where



XI 

(say) for //{I, 2}, and 

^{1,2} ^ N' 
We now analyse the constraint that r and q are coprime. We can factor 



^1/2*^+2 



(11.18) 
(11.19) 



$$r = -m x_{\{1,2\}}^2 s$$

where

$$s := \prod_{I \in \mathcal{P}:\ 1 \in I;\ I \neq \{1,2\}} x_I \sum_{j=3}^{k} \prod_{I \in \mathcal{P}:\ j \notin I;\ I \neq \{1,2\}} x_I;$$

the point is that $s$ does not depend on $x_{\{1,2\}}$. Since $q + 1$ is divisible by $m x_{\{1,2\}}$, we see that $m x_{\{1,2\}}^2$ is coprime to $q$, and thus $(q, r) = 1$ iff $(q, s) = 1$. We can write $q = u x_{\{1,2\}} - 1$, where $u := m \prod_{I \in \mathcal{P}:\ I \neq \{1,2\}} x_I$, and so $(q, r) = 1$ iff $(u x_{\{1,2\}} - 1, s) = 1$.

We may replace $s$ here by the largest square-free factor $s'$ of $s$. If we then factor $s' = vy$, where $v := (s', u)$ and $y := s'/v$, then $u x_{\{1,2\}} - 1$ is already coprime to $v$, and so we conclude that $(q, r) = 1$ iff $(u x_{\{1,2\}} - 1, y) = 1$.

Fix $x_I$ for $I \neq \{1, 2\}$. By construction, $u$ and $y$ are coprime, and so the constraint $(u x_{\{1,2\}} - 1, y) = 1$ restricts $x_{\{1,2\}}$ to $\phi(y)$ distinct residue classes modulo $y$. Since

,^1 /990fe 



(say) thanks to (11.18), we conclude that 



E 



1 



^^^^»^ log AT. 



''{1,2} 



C^l/2'=+2 



^{1,2} 



y 



Using the crude bound (A.11), we may lower bound $\phi(y)/y \gg 1/\log\log N$. (It is quite likely that by a finer analysis of the generic divisibility properties of $y$, one can remove this double logarithmic loss, but we will not attempt to do so here.) We may thus lower bound the left-hand side of (11.17) by

$$\gg \frac{\log N}{\log\log N} \sum\nolimits^{**} \frac{\mu^2(w)}{w},$$

where $\sum^{**}$ sums over all $x_I$ for $I \neq \{1, 2\}$ obeying (11.18). But by Lemma 11.4 we have

$$\sum\nolimits^{**} \frac{\mu^2(w)}{w} \gg (\log N)^{2^{k-1}-2}$$

and the claim (11.17) follows.



Finally, we show (11.16). Observe that each $q$ can be represented in the form (11.14) in at most $\tau_{2^{k-1}-1}(q+1)$ different ways; also, from (11.2) we have $q \ll N^{(2^{k-1}-1)/2^{k+2}} \le N^{1/8}$. We may thus bound the left-hand side of (11.16) by

$$\sum_{q \ll N^{1/8}} D(cN; q)\, \tau_{2^{k-1}-1}(q+1).$$

From the Bombieri–Vinogradov inequality (A.14) and the trivial bound $D(cN; q) \ll N/q$ one has

$$\sum_{q \ll N^{1/8}} q\, D(cN; q)^2 \ll_A N^2 \log^{-A} N$$

for any $A > 0$, while from Lemma A.1 (and shifting $q$ by 1) one has

$$\sum_{q \ll N^{1/8}} \frac{\tau_{2^{k-1}-1}(q+1)^2}{q} \ll \log^{O(1)} N.$$

The claim then follows from the Cauchy–Schwarz inequality (taking $A$ large enough). The proof of Theorem 1.11 is now complete.

A. Some results from number theory

In this section we record some well-known facts from number theory that we will need throughout the paper. We begin with a crude estimate for averages of multiplicative functions.

Now we record some asymptotic formulae for the divisor function $\tau$. From the Dirichlet hyperbola method we have the asymptotic

$$\sum_{n \le N} \tau(n) = N \log N + O(N) \tag{A.1}$$

(see e.g. [29, §1.5]). More generally, we have

$$\sum_{n \le N} \tau_k(n) = \frac{1}{(k-1)!} N \log^{k-1} N + O_k(N \log^{k-2} N) \tag{A.2}$$

for all $k \ge 1$, where $\tau_k(n) := \sum_{d_1, \ldots, d_k:\ d_1 \cdots d_k = n} 1$. Indeed, the left-hand side of (A.2) can be rearranged as

$$\sum_{d_1 \le N}\ \sum_{d_2 \le N/d_1} \cdots \sum_{d_k \le N/d_1 \cdots d_{k-1}} 1$$

and the claim follows by evaluating each of the summations in turn.
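The Dirichlet hyperbola method mentioned for (A.1) can be illustrated concretely: the divisor summatory function is evaluated using only divisors up to $\sqrt{N}$. This is a standard formulation of the method rather than a quotation from the paper, and the function names are illustrative.

```python
from math import isqrt


def divisor_sum_hyperbola(N: int) -> int:
    """Sum of tau(n) for n <= N via the hyperbola method, in O(sqrt(N)) steps."""
    r = isqrt(N)
    # Count lattice points under the hyperbola d1*d2 <= N using symmetry in (d1, d2).
    return 2 * sum(N // d for d in range(1, r + 1)) - r * r


def divisor_sum_direct(N: int) -> int:
    """Sum of tau(n) for n <= N, counting multiples of each d directly."""
    return sum(N // d for d in range(1, N + 1))
```

Both functions count pairs $(d_1, d_2)$ with $d_1 d_2 \le N$; the first exploits the symmetry of the region, which is also the source of the $O(N)$ error term in (A.1).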
We can perturb this asymptotic: 

Lemma A.l (Crude bounds on sums of multiplicative functions). Let f{n) he a multiplicative 
function obeying the bounds 

f{p) = m + 0{-) 
P 

for all primes p and some integer m ^ 1, and 

for all primes p and j > 1. Then one has 

Y,fi'^)^mNlog"'''N 

for N sufficiently large depending on m; from this and summation by parts we have in partic- 
ular that 

j:^-^«^iog-N 

^-^ n 
If f is non-negative, we also have the corresponding lower bound 



Yfin):^mNlog"'-'N 



and hence 



^-^ n 
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One can of course get much better estimates by contour integration methods (and these 
estimates also follow without much difficulty from the more general results in [22]), but the 
above crude bounds will suffice for our purposes. 

Proof. We allow all implied constants to depend on $m$. By Möbius inversion, we can write

$$f(n) = \sum_{d | n} \tau_m(d)\, g\left(\frac{n}{d}\right)$$

where $g$ is a multiplicative function obeying the bounds

$$g(p) = O\left(\frac{1}{p}\right)$$
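and

$$g(p^j) \ll j^{O(1)}$$

for all $j > 1$. In particular, the Euler product

$$\sum_{n=1}^\infty \frac{|g(n)|}{n} = \prod_p \sum_{j=0}^\infty \frac{|g(p^j)|}{p^j}$$

is absolutely convergent.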

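We may therefore write $\sum_{n \le N} f(n)$ as

$$\sum_{k \le N} g(k) \sum_{d \le N/k} \tau_m(d). \tag{A.3}$$

Applying (A.2), we conclude

$$\sum_{n \le N} f(n) \ll \sum_{k \le N} |g(k)| \frac{N}{k} \log^{m-1} N$$

and the upper bound follows from the absolute convergence of $\sum_{n=1}^\infty |g(n)|/n$.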

Now we establish the lower bound. By zeroing out $f$ at various small primes $p$ (and all their multiples), we may assume that $f(p^j) = g(p^j) = 0$ for all $p \le w$ for any fixed threshold $w$. By making $w$ large enough, we may ensure that

$$1 - \sum_{n=2}^\infty \frac{|g(n)|}{n} > 0.$$

If we then insert the bound (A.2) into (A.3) we obtain the claim. □

As a typical application of Lemma A.1 we have

$$\sum_{n \le N} \tau^k(n) \ll_k N \log^{2^k-1} N \tag{A.4}$$

for any $N > 1$ and $k \ge 1$ (see also [38]).
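As a sanity check on (A.4), one can tabulate $\tau$ with a sieve and compare the $k$-th moment with $N \log^{2^k-1} N$. The Python sketch below does this; `tau_table` and `moment_ratio` are names of our own, and the computation illustrates the shape of the bound at one moderate $N$ rather than verifying the asymptotic.

```python
import math

def tau_table(N: int) -> list[int]:
    """tau[n] = number of divisors of n for 1 <= n <= N (tau[0] is unused)."""
    tau = [0] * (N + 1)
    for d in range(1, N + 1):
        for multiple in range(d, N + 1, d):
            tau[multiple] += 1  # d divides this multiple
    return tau

def moment_ratio(N: int, k: int) -> float:
    """Return sum_{n <= N} tau(n)^k divided by N * (log N)^(2^k - 1)."""
    tau = tau_table(N)
    total = sum(t**k for t in tau[1:])
    return total / (N * math.log(N) ** (2**k - 1))

for k in (1, 2, 3):
    print(k, moment_ratio(10**4, k))
```

The printed ratios remain of moderate size, consistent with (A.4). Lower-order terms are still clearly visible at $N = 10^4$, so the ratios should not be read as limiting constants.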

To study the distribution of divisors and prime divisors in more detail, we recall the Turán-Kubilius inequality for additive functions. A function $w$ is called additive if $w(n_1 n_2) = w(n_1) + w(n_2)$ whenever $\gcd(n_1, n_2) = 1$.
Lemma A.2 (Turán-Kubilius inequality (see [B^, page 20])). Let $w : \mathbb{N} \to \mathbb{R}$ denote an arithmetic function which is additive (thus $w(nm) = w(n) + w(m)$ whenever $n, m$ are coprime). Let $A(N) := \sum_{p^k \le N} w(p^k)/p^k$ and $D^2(N) := \sum_{p^k \le N} |w(p^k)|^2/p^k$. For every $N \ge 2$ and for any additive function $w$ the following inequality holds:

$$\sum_{n \le N} |w(n) - A(N)|^2 \le 30 N D^2(N).$$

(Here $\sum_{p^k}$ denotes the sum over all prime powers.)



Example. Let $\omega(n)$ denote the number of distinct prime factors of $n$. Then $A(N) = \sum_{p^k \le N} \omega(p^k)/p^k = \log\log N + O(1)$ and $D^2(N) = \sum_{p^k \le N} \omega(p^k)^2/p^k = A(N) = \log\log N + O(1)$. The Turán-Kubilius inequality then gives

$$\sum_{n \le N} |\omega(n) - \log\log N|^2 \le 30 N \log\log N + O(N).$$

In particular, if $\xi(n) \to \infty$ as $n \to \infty$, then one has $|\omega(n) - \log\log n| \le \xi(n) \sqrt{\log\log n}$ for all $n$ in a set of integers of density 1. For more details see [75].
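The example can be checked numerically. The Python sketch below (helper names are ours) sieves $\omega(n)$, computes $A(N) = D^2(N)$ exactly as a sum over prime powers, and returns both sides of the Turán-Kubilius inequality for $w = \omega$. It uses the observation that the prime powers $p^k \le N$ are exactly the integers $n \ge 2$ with $\omega(n) = 1$.

```python
def omega_table(N: int) -> list[int]:
    """omega[n] = number of distinct prime factors of n, for 0 <= n <= N."""
    omega = [0] * (N + 1)
    for p in range(2, N + 1):
        if omega[p] == 0:  # no smaller prime divides p, so p is prime
            for multiple in range(p, N + 1, p):
                omega[multiple] += 1
    return omega

def turan_kubilius_sides(N: int) -> tuple[float, float]:
    """Return (sum_{n <= N} |omega(n) - A(N)|^2, 30 * N * D^2(N)) for w = omega."""
    omega = omega_table(N)
    # omega(p^k) = 1 on prime powers, so A(N) = D^2(N) = sum of 1/p^k over p^k <= N.
    A = sum(1.0 / n for n in range(2, N + 1) if omega[n] == 1)
    lhs = sum((omega[n] - A) ** 2 for n in range(1, N + 1))
    return lhs, 30 * N * A

lhs, rhs = turan_kubilius_sides(10**4)
print(lhs, rhs, lhs / rhs)
```

In this range the left-hand side is far smaller than the right-hand side, which reflects that the constant 30 is a convenient uniform constant and not a sharp one.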



From (A.1) one might guess the heuristic

$$\tau(n) \sim \log n \tag{A.5}$$

on average. But it follows from the Turán-Kubilius inequality that for "typical" $n$, the number of divisors is about $2^{\log\log n} = (\log n)^{\log 2}$, which is considerably smaller, and that a small number of integers with an exceptionally large number of divisors heavily influences this average. The influence of these integers with a very large number of divisors dominates even more for higher moments. The extremal cases heuristically consist of many small prime factors, and the following "divisor bound" holds:

$$\tau(n) \le 2^{O(\log n/\log\log n)} = n^{O(1/\log\log n)} \tag{A.6}$$

for any $n \ge 1$; see [5^.
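The gap between the average and the typical size of $\tau(n)$ is already visible for moderate $N$. The following Python sketch (our own illustration, with hypothetical variable names) compares the mean of $\tau(n)$ over $n \le N$, which is close to $\log N$, with the median and with the heuristic typical size $(\log N)^{\log 2}$. The agreement of the typical size with the median is only rough at this range.

```python
import math
import statistics

def tau_table(N: int) -> list[int]:
    """tau[n] = number of divisors of n for 1 <= n <= N (tau[0] is unused)."""
    tau = [0] * (N + 1)
    for d in range(1, N + 1):
        for multiple in range(d, N + 1, d):
            tau[multiple] += 1
    return tau

N = 10**5
tau = tau_table(N)[1:]
mean_tau = sum(tau) / N                       # the average, about log N
median_tau = statistics.median(tau)           # what a "typical" n looks like
typical_size = math.log(N) ** math.log(2)     # the heuristic (log N)^(log 2)
print(mean_tau, median_tau, typical_size, max(tau))
```

The maximum of $\tau$ printed at the end shows how far the extremal integers sit above both the mean and the median.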

Turán-Kubilius type inequalities have been studied for shifted primes as well. We make use of the following result of Barban (see Elliott [12, Theorem 12.10]).

Lemma A.3. A function $w : \mathbb{N} \to \mathbb{R}^+$ is said to be strongly additive if it is additive and $w(p^k) = w(p)$ holds for every prime power $p^k$, $k \ge 1$. Let $w$ denote a real nonnegative strongly additive function. Define $S(N) := \sum_{p \le N} w(p)/(p-1)$ and $\Lambda(N) := \max_{p \le N} w(p)$. Suppose that $\Lambda(N) = o(S(N))$ as $N \to \infty$. Then for any fixed $\varepsilon > 0$, the prime density

$$\nu_N(p;\ |w(p+1) - S(N)| > \varepsilon S(N)) \to 0 \quad \text{as } N \to \infty.$$

The same holds for other shifts $p + a$, where $a \ne 0$.

The function $\omega(n)$ is strongly additive. This lemma implies that for a set of primes $p$ of relative density 1, $p + 1$ has about $\frac{1}{2} \log\log p$ prime factors of the form $1 \bmod 4$. To see this one chooses $w(p) = 1$ if $p \equiv 1 \bmod 4$, and $w(p) = 0$ otherwise. In this example one has $S(N) \sim \frac{1}{2} \log\log N$ and $\Lambda(N) = 1$.
We recall the quadratic reciprocity law

$$\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = (-1)^{(n-1)(m-1)/4} \tag{A.7}$$

for all odd $m, n$, where $\left(\frac{m}{n}\right)$ is the Jacobi symbol, as well as the companion laws

$$\left(\frac{-1}{n}\right) = (-1)^{(n-1)/2} \tag{A.8}$$

and

$$\left(\frac{2}{n}\right) = (-1)^{(n^2-1)/8} \tag{A.9}$$

for odd $n$.
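These three laws can be verified by machine for small moduli. The Python sketch below (function names are ours) computes the Jacobi symbol directly from its definition, as a product of Legendre symbols evaluated by Euler's criterion, so that the check does not presuppose the laws being tested. It assumes, as the laws require, that $m$ and $n$ are positive, odd, and coprime in the case of (A.7).

```python
from math import gcd

def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)  # r is 0, 1 or p - 1
    return -1 if r == p - 1 else r

def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd n >= 1, from the prime factorisation of n."""
    result, p = 1, 3
    while n > 1:
        while n % p == 0:  # only prime p can divide the remaining cofactor
            result *= legendre(a, p)
            n //= p
        p += 2
    return result

def check_laws(limit: int) -> bool:
    """Check (A.7), (A.8), (A.9) for all positive odd m, n < limit."""
    odd = range(1, limit, 2)
    for n in odd:
        if jacobi(-1, n) != (-1) ** ((n - 1) // 2):
            return False
        if jacobi(2, n) != (-1) ** ((n * n - 1) // 8):
            return False
        for m in odd:
            sign = (-1) ** ((m - 1) * (n - 1) // 4)
            if gcd(m, n) == 1 and jacobi(m, n) * jacobi(n, m) != sign:
                return False
    return True

print(check_laws(100))
```

Trial division is used only to keep the sketch independent of the reciprocity laws; for large arguments one would use the standard reciprocity-based algorithm instead.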

For any primitive residue class $a \bmod q$ and any $N > 0$, let $\pi(N; q, a)$ denote the number of primes $p < N$ that are congruent to $a \bmod q$. We recall the Brun-Titchmarsh inequality (see e.g. [29, Theorem 6.6])

$$\pi(N; q, a) \ll \frac{N}{\phi(q) \log(N/q)} \tag{A.10}$$

for any such class with $N > q$. This bound suffices for upper bound estimates on primes in residue classes. Due to the $q$ in the denominator of $\log(N/q)$, it will only be efficient to apply this inequality when $q$ is much smaller than $N$, e.g. $q \le N^c$ for some $c < 1$.
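To get a feel for (A.10), the following Python sketch (names are ours) counts primes $p < N$ in each primitive residue class modulo $q$ and reports the largest count divided by $N/(\phi(q) \log(N/q))$. The explicit constant 2 mentioned afterwards is the Montgomery-Vaughan form of the inequality, which is an outside fact and not something stated in the text.

```python
import math

def primes_up_to(N: int) -> list[int]:
    """All primes p <= N (for N >= 2), by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (N + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(N) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, N + 1, p)))
    return [n for n in range(2, N + 1) if sieve[n]]

def brun_titchmarsh_ratio(N: int, q: int) -> float:
    """max_a pi(N; q, a) divided by N / (phi(q) * log(N / q)), for q < N."""
    counts = [0] * q
    for p in primes_up_to(N - 1):  # primes p < N, as in the text
        counts[p % q] += 1
    classes = [a for a in range(q) if math.gcd(a, q) == 1]
    phi = len(classes)  # Euler's totient of q
    bound = N / (phi * math.log(N / q))
    return max(counts[a] for a in classes) / bound

for q in (3, 7, 100, 1000):
    print(q, brun_titchmarsh_ratio(10**5, q))
```

In the Montgomery-Vaughan form every printed ratio is at most 2. As $q$ approaches $N$ the factor $\log(N/q)$ shrinks and the bound loses its strength, which is the inefficiency described above.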

The Euler totient function $\phi(q)$ in the denominator is also inconvenient; it would be preferable if one could replace it with $q$. Unfortunately, this is not possible; the best bound on $1/\phi(q)$ in terms of $q$ that one has in general is

$$\frac{1}{\phi(q)} \ll \frac{\log\log q}{q} \tag{A.11}$$

(see e.g. [SZj). Using this bound would simplify our arguments, but one would lose an additional factor of $\log\log N$ or so in the final estimates. To avoid this loss, we observe the related estimate

$$\frac{1}{\phi(q)} \ll \frac{1}{q} \sum_{d | q} \frac{1}{d}. \tag{A.12}$$



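Indeed, we have

$$\frac{1}{\phi(q)} = \frac{1}{q} \prod_{p | q} \frac{p}{p-1} = \frac{1}{q} \prod_{p | q} \left(1 + \frac{1}{p}\right)\left(1 + O\left(\frac{1}{p^2}\right)\right) \ll \frac{1}{q} \prod_{p | q} \left(1 + \frac{1}{p}\right) \le \frac{1}{q} \sum_{d | q} \frac{1}{d}$$

and (A.12) follows. (One could restrict $d$ to be square-free here if desired, but we will not need to do so in this paper.)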
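The estimate (A.12) can be tested directly. The Python sketch below (helper names are ours) computes, over all $q \le Q$, the largest value of $(q/\phi(q)) / \sum_{d | q} 1/d$. By the classical inequality $\phi(q)\sigma(q) > 6q^2/\pi^2$, which is an outside fact not used in the text, this ratio always lies in $[1, \pi^2/6)$, so the implied constant in (A.12) can be taken to be $\pi^2/6$.

```python
import math

def totient(q: int) -> int:
    """Euler's phi(q), by direct count (adequate for small q)."""
    return sum(1 for a in range(1, q + 1) if math.gcd(a, q) == 1)

def inverse_divisor_sum(q: int) -> float:
    """Return sum_{d | q} 1/d."""
    return sum(1.0 / d for d in range(1, q + 1) if q % d == 0)

def worst_ratio(Q: int) -> float:
    """max over 1 <= q <= Q of (q / phi(q)) / sum_{d | q} 1/d."""
    return max(q / totient(q) / inverse_divisor_sum(q) for q in range(1, Q + 1))

print(worst_ratio(500), math.pi**2 / 6)
```

By contrast, $q/\phi(q)$ itself is unbounded along primorials, which is why (A.11) needs the factor $\log\log q$ while (A.12) does not.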

The Brun-Titchmarsh inequality only gives upper bounds for the number of primes in an 
arithmetic progression. To get lower bounds, we let D{N;q) denote the quantity 

$$D(N; q) := \max_{(a,q)=1} \left| \pi(N; q, a) - \frac{\mathrm{li}(N)}{\phi(q)} \right| \tag{A.13}$$



where $\mathrm{li}(x) := \int_2^x \frac{dt}{\log t}$ is the logarithmic integral. The Bombieri-Vinogradov inequality (see e.g. [29, Theorem 17.1]) implies in particular that

$$\sum_{q \le N^\theta} D(N; q) \ll_{\theta, A} N \log^{-A} N \tag{A.14}$$

for all $0 < \theta < 1/2$ and $A > 0$. Informally, this gives lower bounds on $\pi(N; q, a)$ on the average for $q$ much smaller than $N^{1/2}$.

We remark that the above inequality is usually phrased using the summatory von Mangoldt function $\psi(N; q, a) = \sum_{n \le N :\, n \equiv a \bmod q} \Lambda(n)$. A summation by parts converts it to an estimate using the prime counting function; see [9] for details.
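The quantity $D(N; q)$ and the sum in (A.14) can be computed for small $N$. The Python sketch below is purely illustrative: the names are ours, $\mathrm{li}$ is evaluated as $\int_2^x dt/\log t$ by a midpoint rule (the choice of lower limit 2 is an assumption of this sketch), and primes $p \le N$ are counted. A single value of $N$ cannot confirm an asymptotic saving of arbitrary powers of $\log N$.

```python
import math

def primes_up_to(N: int) -> list[int]:
    """All primes p <= N (for N >= 2), by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (N + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(N) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, N + 1, p)))
    return [n for n in range(2, N + 1) if sieve[n]]

def li(x: float, steps: int = 200_000) -> float:
    """Midpoint-rule approximation to the integral of dt / log t over [2, x]."""
    h = (x - 2.0) / steps
    return h * sum(1.0 / math.log(2.0 + (i + 0.5) * h) for i in range(steps))

def discrepancy(q: int, primes: list[int], li_N: float) -> float:
    """D(N; q): worst deviation of pi(N; q, a) from li(N) / phi(q)."""
    counts = [0] * q
    for p in primes:
        counts[p % q] += 1
    classes = [a for a in range(q) if math.gcd(a, q) == 1]
    return max(abs(counts[a] - li_N / len(classes)) for a in classes)

def bombieri_vinogradov_sum(N: int, theta: float) -> float:
    """Return the sum of D(N; q) over 1 <= q <= N^theta."""
    primes, li_N = primes_up_to(N), li(N)
    return sum(discrepancy(q, primes, li_N) for q in range(1, int(N**theta) + 1))

N = 10**5
print(bombieri_vinogradov_sum(N, 0.4), N / math.log(N))
```

Comparing the printed sum with $N/\log N$, the approximate number of primes up to $N$, gives a rough sense of how small the discrepancies are on average.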